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1 .1  Background  on  Eigenproblems 

The  simplest  elgenproblem  can  be  stated  in  the  form 


A  ti  -  Xiei  (1-1) 
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and  is  a  scaler.  The  vector  e^  is  called  an  eigenvector  and  the  scaler 

\ 

is  called  an  eigenvalue.  We  will  deal  almost  exclusively  with  the  m  ■  n 
case . 


We  must  make  two  observations  about  eigenproblems.  First,  they  are  a 
special  case  of  a  more  general  and  powerful  matrix  analysis  technique  called 
singular  value  decomposition  or  SVD.  We  will  deal  with  SVD  as  well  in  this 


report.  Second,  eigenproblems  are  of  vital  interest  to  the  Air  Force  for  several 
reasons.  We  note,  as  an  example,  how  to  use  eigen  solutions  in  antenna  array 
processing. 

In  the  selected  example  the  antenna  element  weights  (i.e.,  amplitude  and/or 
phase  adjustments)  are  to  be  found  that  steer  a  static  multi-element  antenna  so 
as  to  maintain  maximum  signal-to-noise  ratio  (S/N)  reception.  As  shown  in  Figure 
1.1,  each  of  m  antenna  elements  receives  (at  a  given  time)  a  signal  and  a  noise 
contribution,  and  these  generally  complex  contributions  form  the  m  dimensional 
signal  and  noise  vectors  s  and  n,  respectively.  Each  antenna  element  is 
connected  through  a  generally  complex  weight,  and  these  weights  form  the  vector 
w.  Note  from  Figure  1.1  that  the  weighted  contributions  from  all  antenna 
elements  are  summed  to  form  the  output  (complex)  signal  whose  S/N  is  to  be 
maximized.  This  S/N  may  be  expressed  as  shown  in  terms  of  the  time-averaged 
squared  moduli  of  the  signal  and  noise  parts  of  the  output  signal,  and  the 
resulting  expression  may  then  be  reduced  to  the  indicated  eigenvalue  equation. 

The  matrix  (in  brackets)  in  this  equation  may  be  expressed  as  shown  in  terms  of 
the  time  averaged  outer  products  R^  and  Rss  of  the  noise  and  signal  vectors 
respectively;  these  vectors  are  known  from  measurements  on  the  unweighted  output 
of  each  element.  Thus  the  eivenvalue  equation  may  be  solved,  and  the  eigenvector 
associated  with  the  dominant  (or  maximum)  eigenvalue  will  give  the  weights  that 
maximize  the  S/N. 

A  computer  simulation  of  the  method  that  would  be  used  by  an  optical 
systolic  matrix  vector  system  to  solve  the  eigenproblem  described  above  was 
carried  out.  This  simulation  required  the  specification  of  certain  signal  and 
noise  statistics  in  accordance  with  practical  expectations.  Discussions  with 
RADC  experts  familiar  with  such  expectations  led  to  the  selection  of  a  m  =  128 
antenna  element  problem  with  an  average  S/N  on  the  order  of  unity  at  each 
element.  A  bimodal  signal  distribution  across  the  antenna  elements  was  selected 
so  that  Rss  -  S’S,T  +  8"  s"T,  where  the  lognormal  distributions  s^  - 
( 1/k)  exp  [-(In  k  -  ^)2/(2ct2)1  and  s£  -  ®i29-k  were  used  with  64  exp  (4  + 
a2/ 2),  642  -  (exp  a2  -  1)  •  exp  (2^  +  a2),  s'  -  1,  and  k  -  1,  2,  ...,  128. 


The  matrix  R^  was  selected  to  correspond  to  uncorrelated  Gaussian  noise  so 
that  R  ■  rm^  +  a2!,  where  n.  ■  n  ■  1,  a2  •  E(s'  -  si),  and  I  is  the  identity 

n  S  iC  S  K  K. 

matrix.  Explicit  matrix  inversion  was  avoided  in  forming  the  eigenvalue  equation 

matrix  M  «  R'1  R  by  using  the  expression14 
nn  ss 
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R-1  -  D-1  -  D-1  nnT  D_1/(l  +  nT  D_1  n)  ,  (1-4) 

nn 

where  D  *  o2I.  Note  that  M  is  a  real,  symmetric,  positive  definite  matrix  which 
will  therefore  have  a  set  of  real,  positive  eigenvalues.  In  general  M  and  the 
weight  eigenvector  will  be  complex,  but  this  case  may  be  divided  into  separate 
real-part  and  imaginary-part  eigenproblems  of  the  form  described  above. 

The  computer  simulation  used  the  same  power  method  that  would  be  used  by  a 
typical  optical  systolic  matrix  vector  system  to  obtain  the  eigenvector 
solution.  The  power  method  iterates  matrix-vector  multiplication  operations,  and 
the  simulation  determined  that  N  -  35  such  iterations  were  required  to  obtain  the 
dominant  eigenvalue  and  associated  eigenvector  of  the  128  x  128  matrix  M  to  a 
precision  of  10-4  .  The  total  time  required  to  perform  each  matrix-vector 
multiplication  iteration  is,  according  to  Table  1-1  and  the  discussion 
approximately  Tm  -  23  jxs.  At  a  100  MHz  clock  rate  the  initial  matrix  input 
time  is  the  time  required  to  read  in  the  128(128  +  1 ) / 2  symmetric  matrix  elements 
or  approximately  T^  -  83  ^s,  the  final  weight  eigenvector  output  time  for  128 
vector  elements  is  Tr  -  1.3  ^s,  and  the  test  for  convergence  time  is  Tc  * 

0.01  ^s.  Thus  the  total  OSAP  system  eigenvector  solution  time  is  approximately 
T  •  T^  +  N(Tm  +  Tc)  +  Tr  *  0.89  ms.  Note  that  the  basic  block-floating¬ 
point  computation  rate  is  approximately  2n2/Tm  ■  2  (128)2/(24  ^s)  or  about  1.4 
GigaFLOPS  <1.4  x  109  Floating  Point  Operation  per  Second). 

The  same  eigenvector  solution  could  be  obtained  by  a  state-of-the-art  5 
MegaFLOPS  all  electronic  board  level  array  processor.  In  this  case  the  matrix 
input  and  eigenvector  output  times  would  remain  approximately  the  same,  but  the 
matrix  vector  multiplication  for  each  iteration  would  require,  in  general,  2n^  » 
2(128)2  multiply  and  add  operations.  Since  each  operation  would  require  0.2  us 
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at  the  5  MegaFLOPS  rate,  the  total  matrix  vector  multiplication  time  would  be 
approximately  Tm  -  6.6  ms.  Thus  the  total  non-Optical  Systolic  Array  Processor 
(OSAP)  system  eigenvector  solution  time  would  be  approximately  T  *  35  (6.6  ms)  - 
230  ms,  which  is  about  250  times  longer  than  the  Optical  Systolic  Array  Processor 
(OSAP)  system  solution  time  estimated  above.  This  comparison,  which  is  displayed 
in  Table  1-1,  does  not  take  into  account  the  time  required  to  calculate  the 
matrix  M  given  the  vectors  s  and  n.  Assuming  all  electronic  array  processing  at 
a  5  MegaFLOPS  rate,  this  time  would  be  on  the  order  of  13  ms  since  the  equivalent 
of  roughly  4(128)2  multiply  and  add  operations  are  involved  in  the  calculation 
(which,  as  mentioned  above,  does  not  involve  explicit  matrix  inversion).  Thus 


Table  1-1  -  Optical  Systolic  Array  Processor  System  Application  Example 


Performance 

and  Comparison 

OSAP  System 

All  Electronic  Board 
Level  Array  Processor 

Symmetric  matrix 
read-in  time 

83  ps 

83  ps 

General  matrix-vector 
multiplication  time 

23  ps 

6 . 6  ms 

Dominant  eigenvector 
readout  time 

1.3  ps 

1.3  ps 

Total  eigenvector  V 

Solution  time* 

0.89  ms 

230  ms  ' 

Typical  arithmetic 
operation  rate 

1 _ 

1.4  GigaFLOPS 

5.0  MegaFLOPS 

♦Includes  read-in  and  read-out  times  and  the  time  to  execute  the  35  iterations 
required  to  obtain  the  129  eigenvector  elements  to  a  precision  of  10-1* . 

even  if  the  M  matrix  calculation  time  is  included  in  the  comparison,  the  OSAP 
system  solution  time  is  still  much  less  than  the  non-Optical  Systolic  Array 
Processor  (OSAP)  system  solution  time.  A  separate  Optical  Systolic  Array 
Processor  (OSAP)  system  calculation  of  the  M  matrix  could  also  be  carried  out,  in 
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which  case  the  Optical  Systolic  Array  Processor  (OSAP)  system  eigenvector 
solution  time  would  be  at  least  two  orders  of  magnitude  less  than  the  non-Optical 
Systolic  Array  Processor  (OSAP)  system  solution  time.  Hundreds  of  all  electronic 
board-level  array  processors  working  in  parallel  might  match  the  Optical  Systolic 
Array  Processor  (OSAP)  system  computation  speed,  but  only  at  considerable  expense 
in  size,  power  consumption,  reliability,  etc. 

The  specific  adaptive  antenna  array  processing  example  considered  above 
clearly  shows  the  potential  of  the  Optical  Systolic  Array  Processor  (OSAP) 
system.  In  some  applications  (e.g.,  future  millimeter  wave  adaptive  arrays  on 
tactical  aircraft)  an  antenna  array  steering,  time  for  S/N  maximization  of  less 
than  10  milliseconds  may  be  required  for  arrays  of  more  than  100  elements.  The 
Optical  Systolic  Array  Processor  (OSAP)  system  would  be  of  unique  value  as  an 
enabling  technology  in  such  cases,  and  there  is  little  doubt  that  an  operational 
Optical  Systolic  Array  Processor  (OSAP)  system  would  have  a  similar  enabling  role 
in  a  broad  range  of  other  applications. 

1.2  Precontract  Background  Eigenproblem  Algorithm 

A  group  of  optics  workers  from  Aerodyne,  Stanford,  and  Georgia  Tech, 
published  the  first  paper  on  optical  solutions  to  eigenproblems  (Appendix  A). 

This  paper  led  to  this  contract  as  well  as  to  much  research  elsewhere  on  the  same 
and  related  subject.  The  basic  idea  is  extremely  simple.  We  start  with  any 
vector  Xq .  We  can  show  that  the  set  of  n  eigenvectors  {e-j}  forms  a  complete 
set,  so  we  can  write 

5  -  a.  e.  +  a,  e_  +  ...  +  a  e  .  (1-5) 

01122  nn 

Calling 

*m+l  "  A  2.  ••*) 


d-6) 


we  have 


±  m  >  m  -* 

&m  “  X^  X2  e2  + 


+  a  \  e 
n  n  n 


Clearly  (except  for  the  case  of  degenerate  eigenvalues  dealt  with  in  Appendix  A 
and  elsewhere  in  this  report)  for  large  enough  m  we  can  approximate 


m  =  \  \  \ 


where 


<\k)  >  (\L)  (1-9 

for  all  1  *  k. 

That  is,  by  raising  all  of  the  eigenvalues  to  successively  nigher  powers  we 
reach  a  point  at  which  one  eigenvalue  dominates.  Hence  this  is  called  the  power 


method . 


Usually  we  set 


-  T  ♦  . 

e,  e,  -  1 
k  k 


(1-10 


where  the  superscript  T  Indicates  transposition.  Having  e^,  we  find  using 
Eq.  (1-1). 


1.3  Precontract  Status  of  the  Optical  Matrix  Processor 


The  optical  processor  we  conceived  of  using  was  the  Stanford  processor 
(Appendix  B).  The  reason  was  very  simple:  there  was  no  alternative.  The 


Stanford  processor  was  the  beginning  and  the  prototype,  but  it  is  clear  in 
retrospect  that  its  two  major  drawbacks  were 

1.  Totally  analog  operation  and  hence  very  limited  accuracy  and 

2.  The  necessity  of  using  a  two-dimensional  spatial  light  modulator  to 
allow  changing  of  the  matrix. 

1 .4  Precontract  Goals  and  Approaches 


The  explicit  overall  goal  of  this  contract  was  to  use  optical  methods  to 
solve  eigenproblems  rapidly  and  with  "sufficient  accuracy."  Naturally  we  held 
closely  to  that  goal. 

The  precontract  approach  was  to  implement  the  algorithm  of  Subsection  1.2  in 
the  processor  of  Subsection  1.3.  As  we  began  to  do  this,  we  found  that  both 
approaches  essentially  guaranteed  failure  to  meet  the  overall  goal.  Accordingly 
we  set  out  to  improve  both  the  algorithm  and  the  hardware.  Both  were 
accomplished,  and  we  can  now  show  that  extremely  useful  optical  eigenproblem 
solvers  can  be  constructed  showing  advantages  in 

o  Size, 

o  Weight,  and 

o  Power  consumption 

over  electronic  processors  having  tn  :  same  extremely  high  speed  or,  conversely, 
advantages  in  speed  over  electronic  computers  of  the  same  size,  weight,  and  power 
consumption. 
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2.  REPORT  APPROACH  AND  RATIONALE 


A  great  deal  of  productive  work  was  done  under  this  contract.  Therefore, 
telling  the  whole  story  as  a  continuous  narrative  runs  the  risk  of  hiding  the 
coherence  of  the  effort.  Accordingly,  we  have  chosen  to  relegate  to  appendices 
detailed  discussions  which  were  either  published  or  prepared  for  publication 
under  this  contract.  The  text,  therefore,  serves  as  a  comprehensible  guide  to 
and  through  these  various  individual  efforts  and  concludes  with  an  attempt  to  tie 
summarize  of  these  efforts  as  they  relate  to  the  contract  goal  is  ennunciated  in 
Subsection  1.4. 


3.  PROBLEMS  WITH  THE  PRECONTRACT  ALGORITHM  AND  PROCESSOR  APPROACHES 
3 . 1  Algorithm  Approach 

Early  in  the  contract  we  noted  several  major  problems  with  the  power  method 
as  described  In  Subsection  1.2.  We  summarize  these  briefly  here.  First, 
convergence  might  be  very  slow.  Suppose  the  two  highest  eigenvalues  are  \(l+e) 
and  X,  where  0  <  e  «  1.  Clearly  convergence  is  not  achieved  until  the  iteration 
m  in  which 

[  X.(  1+e)  ]”  »  [\]m  (3-1) 

or 

( 1+e ) m  »  1  .  (3-2) 

We  have 

( 1+e ) m  =  1+me  ,  (3-3) 

or 


m  »  1/e  .  (3-4) 

There  is  reason  to  believe  that  we  may  not  need  m  =•  106  or  more.  Even  the  speed 
of  optics  might  not  overcome  this  disadvantage  so  well  as  to  make  it  superior  to 
electronics.  Second,  the  original  approach  (Appendix  A)  did  not  include  a  truly 
satisfactory  way  of  finding  eigenvectors  beyond  the  first  one.  The  general 


11 


I 


problem  is  called  deflation  -  removing  all  previously-calculated  eigenvector 
information  from  the  problem. 

3 . 2  Optical  Processor  Approach 

The  two  drawbacks  of  the  original  Stanford  processor  for  the  goals  of  this 
contract  were  noted  in  Subsection  1.3.  Here,  we  want  to  discuss  in  more  detail 
why  analog  processing  must  be  abandoned.  For  any  processor  we  can  argue  that  the 
solution  obtained,  while  an  Inaccurate  answer  to  the  problem  posed,  is  a 
perfectly  accurate  solution  to  another  problem  (an  inaccurately-posed  problem). 
Following  this  line  of  reasoning,  mathematicians  have  been  able  to  cast  most 
linear  algebra  accuracy  problems  in  the  following  manner.  The  average  error  in 
the  answer,  e(x),  is  related  to  the  average  error  in  the  calculation  itself, 
e  ( c ) ,  by 

e(x)  =  cond  (A)  e(c)  ,  (3-5) 

where 

cond  (A)  *  "condition  number" 

of  the  matrix.  The  condition  number  is  the  ratio  of  the  largest  to  the  smallest 
eigenvalue.  In  the  case  of  antenna  arrays  with  jammers  in  the  field,  the 
condition  number  of  the  matrix  of  interest  can  easily  be  106.  On  the  other  hand 
the  calculation  error  of  a  super  analog  optical  processor  might  be  e(c)  «  10-2. 
This  suggests  that  the  results  of  optical  eigenproblera  solvers  might  be 
essentially  meaningless.  This  is  why  analog  electronic  computers  have  been 
largely  abandoned  in  favor  of  digital  electronic  computers.  This  Is  also  why  we 
too  soon  abandoned  analog  computers  in  favor  of  digital  ones.  Optical  digital 
computers  will  be  slower  and  more  expensive  than  optical  analog  computers,  but  we 
have  no  alternative  when  solving  eigenproblems  optically. 
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4.  A  DIGRESSION  ON  ANALOG  OPTICAL  COMPUTERS 


While  considering  and  rediscovering  these  concerns  with  optical  analog 
computers,  we  developed  a  totally  new  way  to  use  analog  matrix  processors.  This 
new  approach  offers  significant  speed  and  convergence  advantages  over  methods 
borrowed  directly  from  the  digital  computer  literature.  We  do  not  belabor  these 
advantages  here,  because  we  have  now  abandoned  analog  methods  for  this  problem. 
Our  work  in  this  area  is  shown  in  Appendices  C  and  D. 


5.  ALGORITHM  IMPROVEMENTS 


Having  noted  the  problems  with  the  precontract  algorithm  in  Subsection  3.1, 
we  now  describe  our  successful  efforts  to  solve  those  problems.  The  convergence 
was  accelerated  greatly  by  going  to  another  type  power  method.  We  explain  it 
crudly  here  and  in  much  more  detail  in  Appendix  E.  The  explanation  here  will  be 
in  terms  of  matrix-matrix  multipliers  but  we  show  in  Appendix  E  that  much  the 
same  advantage  can  even  be  extended  to  matrix-vector  multipliers. 

A  matrix  can  be  expanded  in  terms  of  its  eigenvectors.  The  eigenvectors 
themselves  are  orthonormal,  i.e.. 


1  if  i  -  j 

iA  -  ^ij  "  {  } 

J  J  0  if  U  j 


+  T  > 
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(5-1) 


The  outer  product  of  a  single  eigenvector  is 
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We  can  write 


We  now  evaluate  A£.  It  will  contain  homogeneous”  terms  like 
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Thus 
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Squaring  again  we  get  A4 ,  etc.  After  m  squarings  we  obtain 


.m 


2m  +  -►  T  .  2  T  .  ^  %2  ♦  •*  T 

X1  el  el  +  ^  *2  *2  +  +  \  Cn  ®n 


(5-7: 


Thus,  for  example,  10  squarings  leads  to  raising  the  A^'s  to  the  power  1024! 

Thus  the  convergence  has  been  improved  tremendously.  Of  course  once  we  conclude 


X, 


2m 


«k  «k 


(5-s: 


we  can  extract  e^  by,  for  example,  projecting  along  either  the  rows  or  the 
columns. 

The  other  problems  with  the  power  method  were  also  successfully  attacked  in 
Appendix  E,  but  the  required  explanations  are  too  tedious  for  the  text. 


In  the  process  of  working  on  this  matrix  squaring  algorithms,  we  also  made 
some  important  observations  and  innovations  regarding  Singular  Value 
Decomposition  (SVD).  This  work  is  shown  in  Appendix  F  and  will  be  referred  to  in 
more  detail  later. 
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6.  ACCURACY  ISSUES 


As  noted  in  Subsection  3.2,  the  major  non-algorithm  issue  in  accomplishing 
the  contract  goals  is  accuracy.  We  have  attacked  the  accuracy  issue  in  several 
ways.  We  examine  these  complementary  approaches  below. 

6 . 1  Matrix  Reconditioning 

This  is  an  important  but  rather  subtle  method  suggested  in  Appendices  E  and 
F.  The  basic  idea  is  rather  simple.  There  may  be  a  way  to  replace  A  by  an 
"approximate  matrix"  A'  such  that,  for  the  given  calculation  accuracy,  we  get 
closer  to  the  true  answer  by  using  the  less-accurately-posed-but-better- 
conditioned  matrix  A  than  by  using  the  more-accurate-but-worse-conditioned  matrix 
A.  As  an  "existence  proof"  we  showed  how  to  remove  a  singularity  from  A; 
converting  an  unsolvable  problem  into  a  solvable  one!  We  show  here  only  the 
basic  ideas. 

The  SVD  of  A  can  be  written 


s  $  $  1  +  s  $  $  1  +  ...  +  s  $  $ 

111  222  nnn 


(6-1) 


where,  by  convention, 

8,  >  s„  >  ...  >  s  ,  (6-2) 

i  —  l  —  —  n 

The  scalar  s^  is  called  the  k+h/  singular  value  and  ^  is  called  the  k+n/ 
singular  vector.  For  symmetric  A,  Eqs.  (5-3)  and  (6-2)  are  equivalent.  One  of 
many  interesting  properties  of  the  SVD  is  that,  in  a  meaningful  and  well-defined 
sense,  the  best  l  <  n  outer  product  expansion  of  A  is 


Furthermore,  the  "goodness  of  fit"  is  given  by 

G(A)  -  (s*  +  s22  +  ...  +  S*)/(s*  +  s*  +  ...  +  s*)  .  (6-4) 

It  appears  reasonable  to  choose  l  such  that 

GU)  -  e(c)  .  (6-5) 

All  of  this  is  quite  reasonable  and,  unfortunately,  usually  impractical. 

The  reason  is  that  the  SVD  is  seldom  given  and  is  very  difficult  to  calculate  - 
usually  much  more  difficult  than  the  eigenproblem  we  set  out  to  solve.  Hence  we 
set  out  to  invent  an  Approximate  Singular  Value  Decomposition  (ASVD)  method  which 
is 

o  Very  easy  to  calculate  and 

o  Leads  to  an  approximate  matrix  which  is  between  the  original  matrix  A 
and  the  optimum  approximation  a(£). 

The  ASVD  is  discussed  in  Appendix  G. 

6.2  Digital  Optical  Processing 

Aerodyne  did  not  invent  optical  digital  processing,  but  it  has  made  some 
advances  in  this  area.  The  history  of  this  field  is  described  in  Appendix  H. 

The  basic  concept  is  to  encode  a  digital  number  by  a  string  (in  space,  time,  or 
both)  of  analog  signals.  By  a  judicious  encoding  we  can  achieve  high  overall 
accuracy  without  overtaxing  the  dynamic  range  of  any  analog  channel. 

Aerodyne's  contribution  to  this  effort  in  this  contract  was  to  develop  an 
arithmetic  well  suited  to  optical  digital  computing.  In  optics,  since  we  are 


using  analog  channels,  there  is  no  a  priori  reason  to  restrict  ourselves  to  radix 
2  numbers.  With  binary  numbers  only,  one  extra  digit  is  needed  to  carry  the  sign 
information  which  converts  a  non-negative  amplitude  (e.g.,  16  bits)  into  a  real 
(positive  or  negative)  number.  In  any  other  base  there  appears  to  have  been  no 
way  to  do  this  same  thing.  We  have  invented  a  new  arithmetic  which  (a)  solves 
the  problem  and  (b)  reduces  to  the  known  result  for  binary  numbers.  Details  are 
given  in  Appendix  I.  Here  we  simply  illustrate  the  result  for  decimal  (radix  10) 
numbers  in  the  range  -99  to  +99. 

For  a  positive  number,  e.g.,  5,  we  write 

+  5  -  505 

where  5  is  the  sign  digit  which  can  be  any  of  the  following  digits:  0,  2,  4,  6, 

8. 

For  a  negative  number,  e.g.,  -8,  we  first  complement  the  magnitude  (subtract 
it  from  100)  to  obtain  92  and  write 

n  -  592 

where  5  -  1,  3,  5,  -7,  or  9.  All  5  values  are  equally  valid. 

We  now  show  side  by  side  ordinary  decimal  operations  and  operations  in  the 
new  arithmetic 
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6 • 3  Floating  Point  Operation 


Simple  fixed  point  arithmetic  aa  conceived  of  in  all  other  optical  digital 
computers  will  probably  be  Inadequate  for  many  Air  Force  needs.  Like  their 
electronic  counterparts ,  optical  computers  need  floating  point  operations.  Under 
this  contract,  Aerodyne  devised  the  only  two  floating  point  systems  yet  proposed 
for  optical  computers.  One  method  (Appendix  J)  computes  magnitudes  and  exponents 
independently  and  accumulates  magnitudes  on  exponent-determined  detectors.  The 
other  method  (Appendix  K)  uses  a  simultaneous  spatial  encoding  for  the  same 
purpose. 
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7.  HARDWARE  CONSIDERATIONS 


The  interest  in  optical  systolic  array  processing  developed  around  the 
country  in  parallel  with  the  work  on  this  contract  and,  in  fact,  stimulated  by 
the  work  on  this  contract  (see  Appendix  H).  On  this  contract  "only"  one  new 
hardware  approach  was  developed  and  a  new  way  of  using  electronics  in  iterative 
linear  algebra  problems  was  described. 

7.1  The  RUBIC  Cube 

Invented  under  this  contract  in  the  course  conversations  with  employees  of 
the  Naval  Ocean  Systems  Center,  the  Rabid  Unbiased  Bipolar  Incoherent  Calculator 
(RUBIC)  cube  is  a  fully  three-dimensional  systolic  matrix-matrix  multiplier 
(Appendix  L).  The  basic  idea  is  to  use  two  CCD  shifting  spatial  light  modulators 
(as  made  by  Lincoln  Labs,  or  as  could  be  made  by  Hughes)  to  move  two-dimensional 
data  in  such  a  way  that  the  proper  data  are  registered  on  proper  detectors  at  the 
proper  time.  As  the  proper  two-dimensional  spatial  light  modulators  were  not 
available  to  us,  we  simulated  the  RUBIC  cube  with  moving  masks.  That  is,  the 
problem  was  not  that  the  needed  components  were  too  expensive  or  impossible. 

They  were  simply  not  available  for  sale  or  use.  Indeed,  working  with  Hughes,  we 
showed  that  they  could  be  built  (Appendix  M).  Moving  black  and  transparent  masks 
allowed  us  to  test  the  other  hardware  of  a  RUBIC  cube.  We  tested  squaring  (the 
key  operation  in  eigen  solution  as  previously  noted)  for  a  64  x  64  tridiagonal 
matrix  of  l's  along  the  diagonal  and  neighboring  elements  and  0's  elsewhere. 

That  is,  the  matrix  is 
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These  data  are  rearranged  so  that  the  lef t-to-right  flow  follows  Figure  7.1  and 
the  up-to-down  flow  follows  Figure  7.2*  The  l's  are  the  clear  (white)  regions 
while  the  0's  are  black.  Figure  7.3  shows  the  two  data  sets  immediately  before 
they  enter  the  region  of  the  detector  array  and  before  the  first  light  pulse. 
The  first  data  pulse  involves  some  overlap  as  indicated  in  Figure  7.4.  The 
second  pulse  involves  more  overlap  as  shown  in  Figure  7.5.  At  the  end  of  the 
second  pulse,  the  (1,1)  component  of  A2  has  been  computed.  The  overall  result 
should  be 
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The  question  we  examined  is  light  source  and  detector  variability  effects. 

We  found  that  we  could  not  approach  1%  overall  uniformity  in  lighting  without 
diffusing  the  light  so  badly  as  to  be  quite  inefficient.  Relief  by  photographic 
precorapensation  is  clearly  possible.  We  believe,  however,  that  independent  a 
posteriori  gain  control  on  each  detector  is  the  proper  approach.  For  an  NxN 
detector  array  at  most  N  at  a  time  must  be  read  out;  so  sequential  switches,  N 
amplifiers,  and  N  circulating  memories  can  accomplish  this.  It  follows,  as  well, 
that  the  same  mechanism  can  correct  for  typical  nonlinearities  (a  few  percent)  in 
detector  arrays  since  the  number  of  possible  "true  detector  values”  is  quite 
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small.  We  conclude  Chat  digital  optical  Implementation  of  matrix  squaring  is 
quite  feasible  vith  components  which  have  not  but  could  be  built.  The  speed, 
cost,  size,  and  power  advantages  relative  to  current  supercomputers  make  this 
appear  quite  worthwhile. 

7 .2  Iterative  Algorithm  Operation 

Two  basic  types  of  algorithms  can  be  devised  for  problems  such  as  least 
squares,  matrix  Inversion,  and  eigenproblems :  iterative  and  direct.  The  power 
method  we  have  chosen  is  an  iterative  method.  Direct  methods  require  a  foreknown 
number  of  cycles  but  (unlike  the  iterative  case),  direct  methods  require  full 
accuracy  in  each  cycle.  Thus  it  is  not  clear  a  priori  which  will  work  faster 
since  the  Iterative  schemes  can  use  fast  analog  electronics  (not  of  digital 
accuracy)  in  the  loop.  Thus  iterative  schemes  require  more  cycles  but  the  cycles 
can  be  faster. 

Under  this  contract  we  worked  out  the  feedback  logic  for  iterative  methods 
in  general  (Appendix  N).  In  the  power  method  one  nonlinear  step  is  required  in 
each  cycle:  a  renormalization  to  keep  the  result  from  growing  either  too  large 
or  too  small.  The  Aerodyne  approach,  among  other  things,  describes  what  may  be 
called  a  "lagging  renormalization"  method  which  allows  each  digit  to  be 
renormalized  and  recycled  in  the  same  clock  time  in  which  it  is  generated. 
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8.  CONCLUSION 


This  contract  began  with  an  inadequate  algorithm  to  be  implemented  in  an 
as-yet-unspecif led  manner  on  slow,  Inaccurate,  analog  optical  hardware.  It 
concluded  with  a  vastly  Improved  algorithm  which  can  be  implemented  by 
well-defined  methods  on  highly-accurate,  digital  optical  hardware.  The  need  now 
is  no  longer  to  determine  what  to  do  but  to  do  what  we  have  already  learned  how 
to  do  in  principle.  Significant  advantages  in  speed,  size,  cost,  and  power 
consumption  over  electronics  should  result. 
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Eigenvector  determination  by  noncoherent  optical  methods 


H.  J.  Caulfield,  David  Dvore,  J.  W.  Goodman,  and  William  Rhodes 


An  iterative  method  for  finding  the  eigenvectors  and  eigenvalues  of  a  matrix  via  incoherent  optical  matrix- 
vector  multiplication  and  simple  electronic  feedback  is  described. 


I.  Introduction 

A  variety  of  methods  have  been  developed  for  doing 
certain  simple  matrix  operations,  e.g.,  multiplying  a 
matrix  by  a  vector,  using  optical  methods.1,2  These 
methods  are  of  interest  because  they  perform  all  or  most 
of  the  required  operations  in  parallel  and  thus  poten¬ 
tially  offer  extremely  high  speed.  More  complicated 
matrix  operations  are  as  yet  extremely  difficult  to  carry 
out  by  optics.  The  finding  of  eigenvalues  and  eigen¬ 
vectors  of  large  matrices  is  quite  difficult  and  siow  by 
digital  methods.  Of  course,  the  matrix  of  eigenvalues 
can  be  used  to  invert  the  matrix,  so  solving  the  eigen¬ 
value  problem  is  tantamount  to  doing  matrix  inversion. 
An  iterative  approach  to  matrix  inversion  has  been  at¬ 
tempted  optically,3  but  it  requires  for  convergence  es¬ 
timation  of  the  largest  eigenvalue.  This  is  easily  done 
by  forming  the  square  root  of  the  squares  of  the  ele¬ 
ments  of  the  matrix.  We  offer  here  a  matrix  inversion 
method  of  somewhat  greater  generality.  In  particular, 
we  will  find  the  eigenvectors  and  eigenvalues  sequen¬ 
tially. 

II.  Method 

The  proposed  optical  approach  utilizes  an  iterative 
method  of  computing  eigenvalues  and  eigenvectors, 
known  in  linear  algebra  as  the  power  method,4,0  based 
on  the  orthogonality  of  the  eigenvectors.  This  method 
works  well  if  the  matrix  is  of  the  real  symmetric  form 
assumed  by  the  covariance  matrix  of  a  real  vector.  This 
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guarantees  real  eigenvalues.  We  have  no  general  test 
for  its  applicability  to  other  cases.  We  suppose  we  have 
am  N  X  N  matrix  M  of  rank  N  with  a  full  set  of  eigen¬ 
vectors  ei . e/v  and  corresponding  eigenvalues  Xlt 

.  . .  ,  \h-  We  assume  that  the  eigenvalues  are  not  re¬ 
peated  and  that  they  are  numbered  in  order  of  de¬ 
creasing  magnitude.  Thus 

|X||  >  |X?|  >  >  | A/v— 1 1  >  I A/v | .  ll) 

As  a  starting  point  we  choose  some  arbitrary  input 
vector 

V(oi  *  Ciei  +  . . .  +  c.ve.v-  (2) 

Multiply  V(0)  by  M  yields 

V„)  »  cArei  +  .  .  .  +  c\'Xjve,v.  (3) 

With  successive  such  matrix  multiplications,  we  obtain 
the  general  term 

v,„,  =•  MV,„_n  =  ciX"ei  +  .  . .  +  cyXyey.  (4) 

So  long  as  the  starting  vector  Vioi  contains  some  of  ei¬ 
genvector  ei  (for  which,  recall,  the  corresponding  ei¬ 
genvalue  is  greatest  in  magnitude),  the  first  term  of  Eq. 
(4)  comes  to  domination  after  a  sufficient  number  of 
iterations.  Thus  for  n  sufficiently  large,  we  have 

V,„,  =  c i X?e i .  15) 

Similarly,  with  an  additional  iteration,  we  have 

V„*„  ==  c|X,*lei,  1 6) 

and,  therefore, 

ii  ^  X| V,„,  i,l 

This  relationship  holds  on  a  component  by  component 
basis,  and  thus  the  value  of  Xi  can  be  solved  for.  The 
rate  at  which  the  process  converges  is  determined  by  the 

ratio  |X1|/|X2|. 

If  Xj  is  significantly  larger  or  smaller  than  unity  in 
magnitude,  V'(n)  may  become  unacceptably  large  or 
small  afte*  a  number  of  iterations,  and  in  practice  we 
must  normalize  at  each  iteration  to  keep  the  vectors  of 
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controlled  size.  Thus  we  might  obtain  an  output  after 
n  iterations  which  we  normalize  to  U(nl  so  that  |  U , „ , i 
=  1.  Multiplying  U,„,  by  M  produces  an  output 
Wlo  +  ll.  We  normally  would  normalize  W,n  +  1,  to 
L’n  +  i\.  but  if  in  +  1)  is  the  terminal  iteration,  we  can 
write 

w1B+„  =  (8) 

To  check  for  termination  we  compare  the  values  of  W<„, 
and  W,n  +  n  either  on  a  magnitude  or  a  component- 
by -component  basis.  If  the  percentage  change  is  ac¬ 
ceptable,  we  terminate  the  iteration. 


HI.  Implementation 

It  is  clear  that  we  need  an  optical  matrix  multiplier 
for  speed  with  certain  rapid  electronic  processing  be¬ 
tween  matrix  multiplications.  Figure  1  shows  the 
configuration  in  schematic  terms.  The  optical  matrix 
multiplier  devised  by  Goodman  et  al.  seems  to  be 
ideally  suited  for  this  purpose.  The  feedback  method 
of  Psaltis  et  al.3  is  based  on  Goodman's  method  and 
appears  to  have  all  the  necessary  components  to  im¬ 
plement  this  scheme. 

IV.  Representation  of  Bipolar  Quantities 

Because  we  want  to  use  nonnegative  definite  masks 
and  incoherent  light,  the  handling  of  negative  quantities 
requires  some  encoding  of  the  vectors  and  matrix  to 
achieve  monopolar  operation.  Let  the  matrix  be 


mu 

my  i 

m.v  ,v 

and  the  kth  input  vector  be  V*.  We  write 

M  =  V*  -  CO) 

where  M+  and  A/_  have  nonnegative  entries  only,  and 
the  convention  is  adopted  that 

m  *  „  =0  if  m^n  £  0  =  0  if  <  0.  CD 

Similarly,  let 

v,*,  =  v.v  -  vr»,  •  12) 

Then 

v,*.„  =  ,V/V,4l  1 13) 


Coharsnce 

Achieved’ 


Eigenvector J 
Eigenvalue  / 


Convergence 

Test 

(Analog) 


Eigenvalue 
Da  caret net  ton 
Analog) 


Fig  1.  Heart  of  the  eigenvector  analysis  device  is  the  optical  matrix 
multiplier.1-  Analog  circuitry  provides  the  required  feedback. 


which  contains  *2iV  nonnegative  components.  We  then 
operate  on  that  vector  by  a  new  rank  2.V  matrix: 

.  f.  vr  u- 1 


to  obtain 


I M-  .Vrj 


fiyi*i  * 


Note  that  neither  y(* ,  nor  B  has  negative  components, 
so  incoherent  optics  is  quite  adequate  to  represent  them 
both. 

V.  Finding  New  Eigenvectors  and  Eigenvalues 

We  suppose  the  first  K  -  1  eigenvalues  and  eigen¬ 
vectors  of  M  have  been  found  to  be  Xj,  ei,  Xj,  e2,  .  .  .  . 
Xff-i,  ey-|.  We  want  want  to  find  the  feth  eigenvalue 
and  eigenvector.  To  do  this  we  form  a  new  matrix: 

.V,  =  *!]'  t.w  -  \j)  (20) 

n  -  1 

where  II  is  the  matrix  product  operator,  and  /  is  the 
unity  matrix  (which  converts  all  vectors  into  them¬ 
selves).  We  suppose  .V/*.  operates  on  an  eigenvector  e 
of  M  having  an  eigenvalue  X.  Then 


v*,Ml  -  =  i ,vr  -  vr- 1  |v~.,„  -  v, 

=  [A/-V*-*.,,  .vr-vrK*;.| 

-  |A/*V, -»♦■.+  .14) 

Thus 

V.‘,M1  =  .vrv*w„  -v  .vf-vr,M„  US) 

V.-,.  .  =  ,M‘V-  +  .Vf-V.*,*,,.  ■181 

We  replace  the  vector  V,*.,  of  S'  real  components  with 
r  ew  vectors 


Thus  Mi,  and  \f  have  the  same  eigenvectors.  Note, 
though,  that  for  ei, .  .  ,  ey-i.  the  M k  eigenvalues  are 
zero.  As  our  method  tends  to  find  that  eigenvector  with 
eigenvalue  of  highest  absolute  value,  it  will  find  an  ei¬ 
genvector  e*  lie i,  .  ,  ey-i1.  Call  the  .V/,  eigenvalue 
for  e*  "Air".  Then  we  can  find  the  corresponding  M 
eigenvalue  X,.  by  solving  the  equation 

'll 


2264  APOLiED  OPTICS  Vol  20.  No  13  1  July  V38i 


« 


A-  1 


fhu>  ivf  t  an  i!nu  an  trie  eigenvectors  ana  eigenvalues 

i  W  sequentially 

VI.  Concluding  Remarks 

A  hybrid  electronic  and  incoherent  optical  approach 
tor  finding  eigenvalues  and  eigenvectors  of  matrices  has 
been  proposed.  The  optical  hybrid  appears  particularly 
attractive  because  of  the  extremely  high  speed  with 
which  the  iterative  matrix  multiplications  can  be  per¬ 
formed.’  Its  most  important  potential  application 
appears  to  oe  in  problems  in  which  the  rank  of  the  ma¬ 
trix  is  so  large  that  standard  digital  methods  are  too 
slow  Accuracy  required  for  complete  implementation 
of  the  processing  scheme  depends  on  the  ratio  of  the 
largest  eigenvalue  to  the  smallest  (the  condition  number 
of  the  matrix).  Specifically,  to  find  all  larger  ei¬ 
genvalues  A;.  Vj . \n_i  must  be  known  to  within  an 

error  of  <|.\,  |.  Variations  on  the  method  allow  some 
relaxation  in  accuracy  requirements.0  The  power 
method  proposed  here  suffers  from  many  of  the  range 
and  accuracy  problems  common  to  optical  processors. 
For  higher  accuracy  we  mignt  use  the  eigenvectors  de¬ 
termined  optica. '.y  as  inputs  for  a  few  iterations  of  the 
digitally  implemented  method.  In  so  doing  we  would 
utilize  the  optical  processor  for  speed  and  the  digital 
processor  for  numerical  accuracy. 

The  optical  matrix  multiplier  proposed1  J  has  the 
capability  of  handling  multiple  input  vectors  in  parallel. 
This  capability  should  be  of  advantage,  if  used  properly, 
to  allow  for  degenerate  eigenvalues.  If  two  eigenvalues 


are  identical,  unique  eigenvectors  are  no  longer  defined 
Rather,  any  vector  in  a  plane  defined  by  two  spanning 
vectors  is  an  eigenvector.  Of  course,  this  is  extendable 
to  more  than  two  eigenvalues.  We  conjecture  i  without 
proof)  that  by  starting  with  .V  orthogonal  vectors  we  can 
guarantee  at  least  \1  eigenvectors  for  each  .V/ -degen 
erate  eigenvalue.  This  is  an  automatic  check  for  ei¬ 
genvalue  degeneracy  as  well  as  an  automatic  generator 
of  spanning  vectors  for  the  corresponding  eigenvectors 
By  a  Gramm-Schmidt  process  we  can  orthogonaiize 
those  vectors  and  reduce  them  to  their  minimum 
number  M. 

This  work  was  performed  under  contrast  F19628- 
30-C-008T  Rome  Air  Development  Center.  Deputy  for 
Electronics  Technology.  Hanscom  AFB,  Mass.  01731. 
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An  incoherent  opt. cal  aau-process.n^  method  is  described,  which  has  the  potential  for  periorming  discrete  Fourier 
transforms  ot  .en^th  at  rates  far  exceeding  those  afforded  by  both  special -purpose  digital  hardware  and  repre¬ 
sentative  coherent  optical  processors. 


W  e  report  here  on  an  incoherent  optical  method  for 
performing  discrete  Fourier  transforms  tDFT’s),  which 
has  the  potential  for  an  extremely  high  data-throughput 
rate.  The  DFT  operation  may  be  viewed  as  a  process 
of  multiplying  an  input  vector  f  (consisting  of  .V  possi- 
bl\  complex-valued  input  samples)  times  an  .V  X  A' 
matrix  H  [the  n.m  th  element  being  exp*  -;2xnm/A01 
to  yield  an  output  vector  g  i consisting  of  the  A  complex 
Fourier  coefficients);  thus  we  desire  to  perform 

g  =  H  f .  1 1 ) 

Two  separate  issues  must  be  addressed  in  describing  the 
method  of  interest  here:  i  1 )  How  do  perform  the 
matrix  product  in  a  highly  parallel  and  fast  way0  (2) 
How  do  we  perform  complex  ar;t  imeric  using  inco¬ 
herent  light,  for  which  only  nonnegative  and  real 
quantities  (intensities)  can  be  manipulated? 

To  address  the  first  issue,  suppose  that  the  elements 
<>f  f  and  h  are  nonnegative  and  real.  Then  the  system 
shown  in  Fig  1  can  he  used  to  perform  the  matrix- 
vector  product  The  elements  of  f  are  entered  in  par¬ 
allel  bx  controlling  the  intensities  of  .V  light-emitting 
diodes  i  LED's i  Lenses /.iand  /..  image  the  LED  array 
horizontally  onto  the  matrix  mask  ,V/  whiie  spreading 
:he  light  from  any  single  LED  xertically  to  fill  an  entire 
column  of  the  matrix  mask  Lens  L  i  is  a  field  lens. 
The  matrix  mask  M  consists  of  .V  X  .V  subcells,  each 
containing  a  transparent  area  proportional  to  one  of  the 
matrix  elements  Lens  L  ,  is  a  cylindrical  lenslet  array, 
which  is  not  essential  to  the  operation  of  the  system  hut 
which  can  he  used  to  improve  light  efficiency.  Lens 
combination  L-,  collects  ail  light  from  a  given  row  and 
brings  it  to  focus  on  one  element  of  a  vert  ical  array  of  .V 
photodetectors.  Each  photodetector  measures  the 
value  of  one  component  of  toe  output  vector  g. 

To  permit  the  multiplication  o’  a  matrix  h  with 
■  omplex  elements  limes  a  vector  f  with  complex  ele¬ 
ment'.  we  decompose  each  of  these  quantities  as  fol¬ 
lows1  -■ 

f  -  f"  e  f 1  ■  cxpi  j2-  :;i  +  f' expi /4,t  3>. 

o  -  i,-  ■■  ►  ii  ■  ■  pypi  rj*  +  //  - ■  exp<  / 4 7r  A i . 

cj) 

where  f  f  ■  .  and  f  •"  each  consist  ot  A  real  and  non¬ 


negative  elements,  and  M'0t.  'h{]',  and  ‘/C2'  consist  of 
N  X  N  real  and  nonnegative  elements.  If  the  output 
vector  g  is  similarly  decomposed,  then  we  find  that  the 
overall  matrix-vector  product  can  be  expressed  as 


Thus,  complex  operations  can  be  performed  at  a  price 
of  a  factor  of  3  in  the  length  of  the  input  and  output 
vectors. 


s I  L,  L  J  M  L,  L, 


r  v  /  . ' 


(c) 

1  incoherent  optical  processor  configuration  (a), 
pictorial  view  <bh  top  view;  < c ) .  side  view’. 
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Simple  electronic  circuits  tor  producing  the  compo¬ 
nents  f rtl.  f 1 and  rJI  from  f  exist,1  as  do  simple  circuits 
for  producing  the  real  and  imaginary  parts  of  g  from 
g“'\  g,n.  and  g'-T 

Experiments  have  been  carried  out  to  verify  the 
ability  to  perform  complex  arithmetic.  The  source  was 
an  unfiltered,  linear-filament,  dear-envelope,  incan¬ 
descent  bulb.  The  SO  x  :!0  matrix  mask  used  to  per¬ 
form  a  10-point  DFT  is  shown  in  Fig.  2.  This  mask  is 
designed  so  that  the  three  entire  vectors  f 0I,  f1!,  and  f 21 


Fig  2  Matrix  mask  for  a  10-point  DFT. 


are  entered  side  by  side,  whereas  the  three  output 
components  g*l0),  g*111,  and  g*'2’  for  the  fcth  Fourier 
coefficient  appear  side  bv  side.  Thus  the  output  display 
shows  each  DFT  component  as  a  triplet  of  real  and 
nonnegative  components. 

For  this  experiment  the  input  functions  were  entered 
by  hand  as  masks  placed  against  the  matrix  mask,  and 
output  functions  were  detected  on  a  1024-element  Re- 
ticor.  CCD  detector  array.  Figure  3  shows  both  theo¬ 
retical  output  distributions  and  experimentally  ob¬ 
tained  output  distributions,  the  latter  being  photo¬ 
graphed  from  an  oscilloscope  display.  In  parts  (a)  and 
(b),  the  function  to  be  transformed  consists  of  the  se¬ 
quence  ( 1,0,0, 0,0,0,0, 0,0,0).  The  resulting  DFT  should 
be  entirely  real  and  of  constant  magnitude.  As  shown 
in  these  figures,  the  DFT  components  along  the  real  axis 
are  all  nonzero  and  equal,  whereas  the  components 
along  120°  and  240°  are  all  zero. 

In  parts  (c)  and  (d),  the  input  sequence  was  entirely 
real  and  constant.  The  DFT  consists  of  a  large,  real 
zero-frequency  component  (on  the  far  right),  followed 
by  triplets  of  equal  strength  for  all  other  DFT  compo¬ 
nents.  Some  thought  shows  that  any  DFT  component 
with  elements  g*'01.  g*1".  and  g*'21  exactly  equal  is 
equivalent  to  a  zero  result.  Hence  all  DFT  components, 
except  the  zero-frequency  component,  are  zero. 

Parts  te)  and  if)  show  the  results  when  the  entire 


o  ,  f 

F;g  t  T  hf  ■re’ii  hi  '■  1 1.  ,  i  > m 1 1  inti  experimental  [<bi.  idl.  if)]  DFT  results 
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matrix  mask  is  uniformly  illuminated.  In  this  case, 
some  thought  shows  that  the  input  is  effectively  a  se¬ 
quence  containing  all  zeros.  The  output  DFT  shows 
triplets  of  equal  strength,  or  a  sequence  of  all  zeros  for 
the  output 

A  system  composed  of  96  high-speed  LED’s  and  96 
avalanche  photodiodes  would  be  capable  of  performing 
a  12-point  DFT  Commercially  available  components 
have  sufficient  bandwidth,  output  power,  and  sensi¬ 
tivity  to  permit  such  a  DFT  to  be  performed  every  10 
nsec.  The  total  throughput  rate  for  such  a  processor 
is  about  .1  x  lO1*  complex  samples  per  second,  whereas 
a  corresponding  number  for  special-purpose  digital 
array  processors  is  about  3  X  10s  complex  samples  per 
second  and  a  representative  coherent  optical  processor5 
has  a  throughput  of  3  X  10"  real  samples  per  second. 

The  chief  significance  of  this  processor  is  that  the 
input  data  can  be  entered  in  parallel,  and  it  is  this  fact 
that  leads  to  its  high  throughput  rate.  Another  system 
recently  described4-5'  performs  a  similar  matrix-vector 
product,  but  the  data  must  be  entered  serially,  and  as 
a  consequence  the  throughput  rate  is  much  lower.  The 


processor  described  here  is  especially  well  suited  for 
problems  in  which  the  elements  of  the  input  vector  f  are 
gathered  by  parallel  sensors.  Of  course,  matrices  other 
than  the  DFT  matrix  can  also  be  used  if  desired 

This  work  was  supported  by  the  Office  of  Naval  Re¬ 
search. 

References 

1.  J.  W.  Goodman  and  L.  M.  Woody.  “Method  for  performing 
complex-valued  linear  operations  on  complex -valued  data 
using  incoherent  light,"  Appl.  Opt.  16,2611  (1977). 

2.  If  one  is  sufficiently  clever  in  eliminating  unwanted  terms 
at  the  output,  real  and  imaginary  components  on  biases  can 
be  used.  However,  the  dynamic  range  of  the  system  is 
reduced  by  such  an  approach. 

3.  We  refer  specifically  to  a  system  with  an  electron -beam 
addressed  DKDP  input  light  valve,  which  is  capable  of 
entering  106  data  points  30  times  per  second  See  D  Ca- 
sasent,  Proc.  IEEE  65,  143  (1977). 

4.  R.  P.  Bocker,  Appl.  Opt.  13,  1670  (1974). 

5.  M.  A.  Monahan,  K.  Bromlev,  and  R.  P.  Bocker.  Proc.  IEEE 
65,  121  (1977). 


B-4 


APPENDIX  C 

NEW  ANALOG  OPTICAL  COMPUTER 
FOR  ALGEBRAIC  EQUATIONS 


THIS  APPENDIX  HAS  INTENTIONALLY  BEEN  LEFT  BLANK. 


C-l 


APPENDIX  D 

APPLICATION  OF  THE  OPTICAL  PROCESSOR 
OF  APP.  C  TO  THE  EIGEN  PROBLEM 


THIS  APPENDIX  HAS  INTENTIONALLY  BEEN  LEFT  BLANK. 


APPENDIX  E 


ALGORITHM  IMPROVEMENTS  FOR  THE  EIGEN  PROBLEM 


Reprinted  from  Applied  Optics,  Vol.  22,  page  2075.  July  15.  1983 
Copyright  ?  1983  by  the  Optical  Society  of  America  and  reprinted  by  permission  of  the  copyright  owner 


Algorithm  improvements  for  optical  eigenfunction 
computers 

John  Gruninger  and  H.  J.  Caulfield 


Prior  iterative  approaches  to  optical  eigenfunction  solution  have  at  least  three  major  problems:  slow  con¬ 
vergence  (sometimes),  decreasing  accuracy  after  the  first  solution,  and  imperfect  parallel  renormalization 
(leading  to  poor  use  of  svstem  dynamic  range  and  hence  poor  accuracy)  We  introduce  new  approaches  and 
algorithms  to  solve  these  problems.  The  new  algorithms  lead  to  a  tight  error  bound  on  eigenvalues  and  an 
automatic  handling  of  degenerate  or  near  degenerate  eigenvalues.  Applications  are  discussed. 


I.  Introduction 

There  has  been  a  recent  increase  in  interest  in  using 
optics  to  perform  certain  simple  algebraic  operations1-* 
and  to  use  those  optical  operators  to  perform  iterative 
operations  solving  practical  problems. We  are  con¬ 
cerned  here  with  the  use  of  optical  algebraic  operations 
to  solve  eigenvector  problems.  Prior  work3  "  used  op¬ 
tical  vector-matrix  multiplication  to  carry  out  a  classical 
procedure  called  the  power  method.  We  will  review  the 
power  method  here,  indicate  the  three  major  problems 
from  which  it  suffers,  and  show  how  those  problems  can 
he  solved. 

Let  us  assume  that  we  have  a  full  rank  symmetric  .V 
X  .V  matrix  A.  We  know  that  A  has  .V  real  eigenvalues 
\ i .  \j, ....  X.v  and  .V  eigenvectors  ei,  •  •  •  .  e.v  satis¬ 
fying 

e„  •  e,  =  1 1) 

Futhermore  ei,  e;,  . .  . ,  e.v  span  the  allowable  vector 
space.  Thus  an  arbitrary  vector  V„  can  be  written 


V„  =  <nei  *■  + . +  a.ve.v.  i2) 

where  a  j,  a. a  v  are  scalers. 

Let  us  write 

V,  =  AV„.  i3> 


Applying  A  to  successive  Vy  values,  we  obtain 
V,  =  AV,  =  A  -Vo, 

'•II 

V,,  =  AV.,.,  -  A?V, 

binoe 
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Apa*e»  *  2»Ape*  *  a*A*pe*.  io) 

therefore, 

Vp  *  a,A,8ei  +  u.Xv>e2 +.  .-rav.\,spev  ini 

If 

I  A,  |  >  |  A^  j  o  i 

for  m  ^  l,  the  1 th  eigenvector  will  (above  some  number 
of  iterations  p)  come  to  dominate  VP.  so  for  p  suffi¬ 
ciently  large 

V,  2:  aoVe,  i9i 

Of  course,  we  recognize  this  condition  by  the  fact 
that 


V?aA,V0.,  ,10» 

We  can  now  discuss  the  problems  with  this  method. 
First,  the  convergence  can  be  very  slow.  If  we  require 
P  =  10H.  even  an  optical  processor  is  slow.  The  second 
problem  relates  to  deflation,  that  is.  finding  the  smaller 
|X,|  values  and  the  corresponding  e*  values.  While 
there  are  many  deflation  methods,  most  lead  to  answers 
with  decreasing  accuracy.  The  primary  problem  is  that 
most  deflation  methods  assume  a  perfect  accuracy  of 
previously  calculated  results.  Thus  errors  tend  to  ac¬ 
cumulate,  and  very  significant  errors  can  occur  for  rel¬ 
atively  low  values  of  |  X*  | .  It  is  sometimes  true  that  we 
want  only  a  few  of  the  dominant  eigenvectors,  but  it 
would  be  unwise  to  accept  this  limitation  if  it  can  be 
avoided.  Third,  we  need  a  fully  parallel  way  to  deal 
with  the  normalization  problem.  Otherwise  we  lose  the 
advantages  of  parallel  optical  computation.  The 
renormalization  referred  to  is  a  necessity  forced  on  us 
by  the  fact  that  optics  uses  fixed  point  rather  than 
floating  point  calculations.  The  vector  components  ot 
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Vp  may  be  either  very  large  t  it  \  A*  |  >  1 )  or  very  small 
( if  |  \fc !  <  1).  Thus  we  renormalize  after  each  iteration. 
To  renormalize  we  must  estimate  the  maximum  com¬ 
ponent  and  set  the  input  so  that  the  maximum  output 
value  is  large  but  not  beyond  the  range  of  our  optical 
computer.  How  do  we  estimate  that  component1  How 
cati  we  check  for  saturated  components  without  looking 
at  the  components  sequentially  and  thus  slowing  down 
operations'1  Besides  these  major  problems  there  are 
also  unanswered  questions  on  how  to  handle  degenerate 
solutions  and  how  to  estimate  accuracy  etc. 

Having  introduced  the  problems  with  prior  ap¬ 
proaches,  we  move  to  a  discussion  of  possible  solutions 
to  those  problems. 

II.  Convergence  Problem 

By  reformulating  the  power  method,  we  can  intro¬ 
duce  considerably  more  parallelism  in  each  iteration 
and  thus  reduce  the  number  of  iterations  dramatically. 
For  example,  a  problem  which  would  have  required  lO*5 
iterations  by  the  prior  method  will  now  require  only  20 
iterations.  In  general  K  iterations  with  the  new  power 
method  is  equivalent  to  2K  iterations  of  the  prior 
method.  We  achieve  this  by  using  the  matrix  squaring 
method.-  We  briefly  explain  the  method  as  well  as  add 
our  own  observations  concerning  the  advantages  of  the 
matrix  squaring  algorithm  over  the  power  method  just 
described.  The  reader  will  note  that  matrix  squaring 
is  itself  a  power  method,  but  it  operates  on  the  given 
matrix  itself  rather  than  operating  on  a  vector  while 
leaving  the  matrix  uncharged. 

The  matrix  squaring  method  for  eigenvalue  eigen¬ 
vector  analvsis  is  based  on  the  spectral  representation 
of  a  symmetric  matrix: 

A  =  K.VK  ",  < ;  1 1 

where  A  is  a  diagonal  matrix  containing  the  eigenvalues 
of  A  and  E  is  an  orthogonal  matrix  whose  columns  are 
the  eigenvectors  i  t  A.  That  is.  'he  6th  column  of  E  is 
the  eigenvector  e*  associated  with  rhe  eigenvalue  A*. 
The  orthogonality  of  eigenvectors  of  svmmetric  matri¬ 
ces  is  expressed  in  maim  form  as 

KF.  r  »  rk  *  ;  1 12) 

We  use  ‘his  property  to  express  powers  of  A.  Writing 
A  "  as  1  E.\E  '  1  'HAE  !.  ,  1  EAEr>  with  n  factors, 

v  -  F.  \  K  "  :  t  :vi 

Performing  'he  ma tr  x  m  lltmlications.  A"  can  be  ex¬ 
pressed  IS 

‘  \  e  e  14) 

where  S  ; -  'he  dime*  of  A. 

h  r  i  r-nvefi'er-ce  we  a;  .  as-cme  tb.-q  eigenvalues  are 
.  T-lerei ; 

!  or  ’  he  ■.  o  -e  :  .’ml  .  >  i  i  jegener  »t**  "i  gen  value.  tor 
'Ufticiell'-.v  'arge  o , 
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Each  ■  oiumn  of  A'1  is  proportional  to  the  normalized 
eigenvector  ej,  each  row  is  proportional  to  the  transpose 
el.  1  o  obtain  the  eigenvalue  A|,  any  column  of  A"  can 
be  multiplied  bv  A.  The  power  to  which  A  must  be 
raised  depends  on  the  dominance  of  A].  The  conver¬ 
gence  is  of  the  order  of  lA-cAti'1.  If  n  is  sufficiently 
large  he  rank  of  A  ''  is  one,  and  each  column  can  be 
normalized  to  ei.  This  operation  forms  a  test  to  ensure 
that  toe  eigenvalue  is  nondegenerate.  Since  the  spec¬ 
tral  decomposition  of  A  contains  contributions  from  all 
its  eigenvectors,  all  the  dominant  eigenvectors  are 
contained  in  Ah  Thus -the  rank  of  A-1  is  the  degeneracy 
of  the  dominant  eigenvalue.  For  a  degenerate  case,  say 
degeneracy  two,  where  two  eigenvectors  have  the  same 
eigenvalue.  A”-  can  be  approximated  hv 

A”  -  A^ieie ‘  +  <17> 

where  ej  and  e?  are  orthogonal  but  are  associated  with 
A..  E  ich  column  of  \n  is  a  linear  combination  of  ej  and 
eo  ano  hence  is  an  eigenvector  of  A.  However,  on  nor¬ 
malization,  the  columns  of  A'1  will  not  be  identical.  The 
rank  <  f  A'1  is  equal  to  two,  the  degeneracy  of  A,.  Any 
two  linearly  independent  columns  of  Ar-  can  be  used  to 
obtain  two  orthogonal  eigenvectors  of  A.  The  clear 
advantage  of  the  matrix  squaring  method  is  that  ail  the 
degenerate  eigenvectors  of  an  eigenvalue  can  be  ob¬ 
tained  at  once  because  no  mechanism  favors  one  over 
the  otaers. 

By  actually  forming  An.  a  useful  error  bound  for  the 
magnitude  of  the  dominant  eigenvalue  can  be  obtained. 
The  bounds  can  be  derived  as  follows.  If  we  raise  A  to 
an  even  power,  n  =  2m.  all  the  eigenvalues  of  A"  are 
positive.  Its  dominant  eigenvalue  A'J  will  be  smaller 
than  i  ts  trace,  which  is  equal  to  the  sum  of  all  its  eigen¬ 
values. 

.V  V 

7>A  =  1  A"  =  Z  A’. 

Here  .V  is  the  dimension  of  A.  On  the  other  hand,  the 
dimension  times  the  dominant  eigenvalue  is  larger  than 
the  trace.  Therefore. 

a i  <  7V.V  <  .v,vj  i lei 

Rearranging  this  and  taking  the  nth  root  yield 

/ip  » 

[— |  rr-.vi1  -  <■  i \,|  <i7>a"i,'’>  ii9) 

For  n  sufficiently  large,  (l/:V)I  n  approaches  one  to 
within  the  precision  of  the  processor,  A  good  estimate 
of  !  Ai|  is  the  mean  of  the  upper  and  lower  bounds, 


f  •  ]  'i  I  nl 

t/>  ~  '  1  -  !  ~!  !  I  Tr  A  'M 5  *''! 1 .  <21  I 

,  !.v  | 

(t  should  be  noted  that,  the  matrix  squaring  method 
raises  A  to  an  even  power,  and  hence  we  are  finding  ei¬ 
genvectors  and  eigenvalues  of  A-  rather  than  A.  The 
eigenvectors  of  A-  will  he  identical  to  eigenvectors  •>(  A 
except  c’hec.i-*  -.vlii-re  A  has  I  wn  nml  -.  wtn<  t>  -  it  i  I  . 


V.-  ) 


A,  =  -A,.  In  this  case  A-  has  a  doubly  degenerate  ei¬ 
genvalue  At.  Only  two  particular  linear  combinations 
of  the  degenerate  eigenvectors  of  A-  will  be  eigenvectors 
of  A.  When  a  degeneracy  or  an  apparent  degeneracy 
occurs,  a  new  eigenvalue  eigenvector  problem  must  be 
solved.  We  use  the  orthogonalized  linear  independent 
columns  V’*,  of  A"  to  form  a  new  matrix  G  given  by 

G„  =  V^AV,  (22) 

The  dimension  of  G  is  of  the  order  of  the  apparent  de¬ 
generacy.  The  eigenvectors  of  G  yield  the  linear  com¬ 
binations  of  the  V’s,  which  are  eigenvectors  of  A.  For 
a  true  degeneracy  G  is  already  diagonal. 

If  we  accomplish  matrix-matrix  multiplication  by 
sequential  matrix-vector  multiplications  using  the 
columns  of  the  matrix  as  vectors,  we  require  N  matrix 
vector  multiplications  to  accomplish  one  matrix 
squaring.  If  convergence  requires  M  squarings,  a  total 
of  .Vf.V  matrix-vector  cycles  will  be  needed.  Accom¬ 
plishing  the  raising  of  A  to  the  same  power  by  the  prior 
method  would  require  2Vf  matrix- vector  multiplica¬ 
tions.  For  slowly  converging  systems 

2m  >>>  l,  (23) 

while  M  and  N  may  be  relatively  small.  For  example, 
if  M  =  20  and  .V  =  50.  we  would  need  20  matrix-matrix 
multiplications  by  matrix  squaring  or  1000  matrix- 
vector  multiplications,  whereas  10"  matrix-vector 
multiplications  would  be  required  by  the  prior  power 
method.  Clearly  the  convergence  is  improved  dra¬ 
matically  by  matrix  squaring  even  if  the  hardware  is 
restricted  to  vector-matrix  multipliers. 

III.  Deflation 

Deflation  remains  a  vexing  problem  in  that  it  tends 
to  lead  to  decreasing  accuracy  in  subsequent  eigenso- 
iutions.  This  problem  is  magnified  when  only  low- 
precision  is  available.  While  we  have  arrived  at  no  final 
solutions  to  the  problem,  we  suggest  two  methods  which 
may  prove  fruitful.  The  main  problem  is  that  one  finds 
i>nlv  approximate  eigenvalues  A)  and  approximate  ei¬ 
genvectors  ei  rather  than  the  exact  quantities.  We  seek 
methods  which  will  be  adaptable  to  the  matrix  squaring 
approach  and  for  which  the  errors  do  not  accumulate 
as  successive  eigenvalues  and  eigenvectors  are  found. 
The  latter  restriction  is  the  most  important  for  pro¬ 
cessing  with  low  precision.  Common  approaches  which 
can  be  incorporated  inti)  the  matrix  squaring  method 
include  deflation  by  subtraction  and  deflation  by  or- 
thogonaiization.  Perhaps  the  most  obvious  technique 
:s  deflation  hy  subtraction  in  which  a  new  matrix  to  use 
[or  the  power  method  is  generated  from  A  by  subtract¬ 
ing  A  e  e  r  from  A.  This  approach  was  first  suggested 
bv  Hoteiling.  However,  in  practice,  errors  in  both  the 
estimated  eigenvalue  and  eigenvector  can  lead  to  nu¬ 
merical  errors  when  the  power  method  is  applied  to  the 
deflated  matrix  to  obtain  A:. '  For  these  reasons,  the 
method  .-mould  be  used  only  m  formal  analysis. 

The  deflation  bv  orthogonaii/atmn  method  addresses 
these  difficulties  by  choosing  a  rial  vector  V  for  'he 
power  method,  which  .  -  >rthngonai  to  ej  However. 


since  we  only  know  ei  and  Ai  approximately,  the  or- 
thogonalization  is  only  approximate,  and  the  true  ei 
component  grows  and  may  become  dominant  again.11 
A  wise  procedure  is  to  reorthogonalize  the  current  vector 
to  previously  found  eigenvectors  from  time  to  time.  A 
useful  way  to  perform  the  orthogonalization  in  a  trial 
vector  V  is  to  use  the  annihilation  operation  (A  — 
Ai/), 

V1  =  (A  -  A,/)V.  (241 

Orthogonalizing  in  this  way  has  the  advantage  of  re¬ 
moving  explicit  error  contributions  due  to  errors  in  the 
eigenvector.  Only  the  error  in  the  eigenvalue  estimate 
contributes  to  the  growth  of  the  unwanted  component 
in  the  power  method.  This  approach  can  be  incorpo¬ 
rated  into  the  matrix  times  matrix  approach  by  forming 
the  product  Am(A  -  A \I)k,  where  we  have  multiplied 
the  starting  vector  with  A  a  total  of  m  times  and  reor- 
thogonalized  k  times. 

Under  the  conditions  of  low  precision  the  best  pro¬ 
cedure  may  be  to  reorthogonalize  at  each  step.  Then 
k  =  ■u,  and  the  method  is  equivalent  to  finding,  the 
principal  eigenvector  of  the  matrix  A  =  A(A  -  A^). 
Error  analysis  shows  that  the  Ai  component  contami¬ 
nates  the  A 2  as 


where  (5  is  the  error  in  our  estimate  of  \lt  i.e.,  6  =  Ai  - 
Aj.  This  procedure  is  safe,  and  the  power  method  can 
be  made  to  converge  to  each  eigenvector  in  turn.  The 
accuracy  is  limited  by  the  accuracy  of  the  previously 
estimated  eigenvalues.  The  errors  in  estimates  of  ei¬ 
genvalues  must  remain  small  compared  with  all  the  ei¬ 
genvalues  sought  and  to  differences  between  eigenval¬ 
ues  sought.  For  the  later  eigenvectors  the  method  be¬ 
comes  cumbersome,  but  as  long  as  the  magnitude  of  the 
eigenvalue  sought  is  larger  than  the  largest  error  in  a 
previously  estimated  eigenvalue  the  method  will  con¬ 
verge. 

A  little  known  method  for  finding  all  the  eigenvalues 
and  eigenvectors  involves  double  shifting.  1--1:1  It  has 
the  advantage  that  one  starts  fresh  at  each  time,  and 
thus  no  accumulation  of  errors  results.  It  also  is  no 
more  cumbersome  as  more  eigenvalues  are  found.  At 
no  stage  is  the  knowledge  of  eigenvalues  to  high  preci¬ 
sion  required.  It  is  based  on  forming  a  family  of  ma¬ 
trices 

V(u./0  =  'A  —  u/l-‘  -  H-l  1 26 1 

for  use  with  the  power  method.  The  eigenvectors  of  A 
are  eigenvectors  of  Q.  Q  has  eigenvalues 

■V,  =  b-  -  R:.  (27i 

where 

*>  =  \  -  M 

The  strategy  is  as  follows.  The  b;  are  all  positive. 
The  smallest  one  is  the  one  tor  which  u  is  closest  to  th*3 
eigenvalue  A  B-  is  chosen  so  that  the  most  negati 
V  .  the  one  associated  with  the  smallest  b  ,  is  the  dom- 
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inant  root.  A  sufficient  condition  is  to  choose  B  so  that 
all  the  qL  are  negative.  Then  the  cj,  associated  with  the 
smallest  br  will  be  the  most  negative  and  hence  the 
dominant  root.  The  approach  is  to  applv  the  matrix 
squaring  method  to  the  family  of  matrices  Q[nJ3)  until 
all  the  eigenvectors  and  eigenvalues  of  A  are  obtained. 
If  the  power  method  is  first  applied  to  A  to  obtain  \.t  and 
ej,  a  safe  value  of  B  is  any  number  larger  than  B  >  j  Aj  j 
+  |/u|-  Here  m  can  be  our  best  guess  as  to  the  next  ei¬ 
genvalue  of  interest.  The  convergence  of  the  method 
to  a  solution  depends  on  the  two  eigenvalues  of  A  which 
are  closest  to  m-  If  X,  is  closest  and  X;  is  next  closest, 
that  is,  if 


i  X,  -  .a I  <  ( -  ft )  <  t  X*  -  m  I  for  all 

Qm  converges  to  q?e,el'  as 


k  *  i 
k  *  j. 


(29) 


[(X,  —  Ml2- 
[(X,  -mi2-  B‘ 


(30) 


It  is  clear  that  only  good  choices  for  m  and  B  are  re¬ 
quired:  no  precise  values  are  needed.  However,  the  rate 
of  convergence  can  be  slowed  by  excessively  large 
choices  of  B  or  a  choice  of  m  for  which  ( X ,  -  n )  =»  ( X,  - 


n ).  The  method  is  no  more  cumbersome  for  small  roots 


than  for  large  roots.  The  rate  of  convergence  will  be 
slower,  however,  if  there  are  several  small  roots  which 


are  close  together.  Under  those  conditions  qj/q,  will 
be  close  to  unity  for  any  choice  of  a*.  Precision  will  limit 
the  dynamic  range  of  eigenvalues  that  can  be  found. 
The  magnitude  of  B  must  be  such  that  qj/ql  is  less  than 
one  for  convergence.  Both  deflation  by  orthogonaii- 
zation  and  deflation  by  double  shifting  are  attractive 
approaches  for  obtaining  subsequent  eigenvectors  and 
eigenvalues  of  a  matrix.  Both  are  easily  incorporated 
into  the  matrix  squaring  method. 


IV.  Role  of  Precision  in  Error  Analysis 

Important  considerations  in  the  application  of  the 
power  method  are  the  limits  placed  on  the  method  by 
the  precision  of  the  computer. 

These  limits  are  based  on  the  precision  to  which  we 
can  obtain  the  eigenvalue  of  largest  magnitude.  De¬ 
flation  techniques  based  on  orthogonality  will  not  find 
eigenvectors  for  eigenvalues  which  are  smaller  than  the 
er.  jt  in  any  proceeding  eigenvalue.  Assuming  that 
errors  arise  only  because  of  precision,  the  largest  error 
will  be  associated  with  the  principal  eigenvalue.  For 
example,  if  the  precision  is  such  that  only  »•  decimal 
figures  are  significant,  the  error  associated  with  X]  is 
approximately  X;  x  10-<. 

Therefore,  the  smallest  eigenvalue  of  A  that  can  be 
found  X,m  satisfies 


This  can  be  shown  directly  by  substituting  o  =  X  10-" 
into  the  convergence  factor  of  Eq.  (27),  which  must  be 
less  than  unity  It  is  also  not  possible  to  distinguish 
between  true  degenerates  and  near  degeneracies  if  two 
or  more  eigenvalues  diffe  r  by  less  fhan  the  error  in  the 
principal  eigenvalue. 
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While  the  double-shift  method  does  not  require  ac¬ 
curate  values  of  previously  obtained  eigenvalues,  there 
are  direct  effects  of  precision  on  the  ability  of  the  ap¬ 
proach  to  resolve  near  degeneracies.  If  there  are  sig¬ 
nificant  figures,  the  convergence  factors  must  be  <1- 
L0-\ 

For  the  double-shift  method 


To  insure  that  the  most  negative  eigenvalue  is  the  most 
dominant,  B  must  have  the  same  magnitude  as  X|.  The 
best  choice  of  n  is  X,,  and,  therefore,  the  best  possible 
convergence  factor  for  the  double-shift  method  is 


1  -  10-*  >  1  - 


(331 


where  we  have  substituted  n  -  X,,  Xj  =  B,  and  rear¬ 


ranged. 

In  this  method  eigenvalue  pairs  whose  square  dif¬ 
ference  satisfy  (X,  —  X,)2  <  Xp  X  10-i  will  appear  to  be 
degenerate. 

-Another  practical  consideration  is  the  power  to  which 
a  matrix  should  be  raised  to  obtain  an  eigenvalue  esti¬ 
mate  that  is  consistent  with  the  number  of  significant 
figures  of  precision.  An  upper  bound  on  the  power  to 
which  a  matrix  can  be  raised  to  obtain  meaningful  re¬ 


sults  can  be  found  by  considering  the  bounds  on  the 
eigenvalue  obtained  from  the  trace.  The  error  is  given 


by 


1  -  ll/V)1'p 

5  =  i — iTrAf’)111'  =  (7>A/’)1/'T0->.  (34) 

2 

Dividing  Eq.  (34)  by  (TrAp)l/p  and  solving  for  P,  using 
the  approximation  ln(  1  +  x )  *  x  for  small  x.  yield  P  = 
(10s  la/V)/2,  where  N  is  the  dimension  of  the  matrix. 
This  assumes  s  is  the  number  of  significant  decimal 
figures.  P  represents  an  upper  bound  to  the  power  to 
which  the  matrix  should  be  raised.  For  s  =  2  and  N  = 
50,  we  have  P  =  185. 


V.  Renormalization 

The  renormalization  problem  may  become  very  im¬ 
portant.  The  i ,;  term  of  A2  is 

.V 

(a*),/  =  E  a,»a*;.  (351 

1 

A  very  conservative  approach  is  to  note  that  the  maxi¬ 
mum  possible  la2),J  is  .V  times  the  square  of  the  maxi¬ 
mum  a,j.  The  trouble  is  that  this  approach  is  so  con¬ 
servative  that  it  is  likely  to  make  very  poor  use  of  the 
available  dynamic  range  of  the  optical  processor  and 
erode  the  accuracy  of  results  in  a  system  which  already 
has  limited  accuracy.  By  doing  each  iteration  twice 
(doubling  an  extremely  short  processing  time),  we  can 
do  much  better.  We  use  the  ultraconservative  but 
simple  approach  just  described  to  normalize  the  inputs 
to  estimate  the  maximum  component  of  A2  from  A. 
With  the  estimated  A2  we  do  far  less  conservative 
renormalization  and  thus  preserve  accuracy 

Thus  we  must  search  both  the  accurately  calculated 
A  and  the  crudely  calculated  A2  for  their  maximum 


E-5 


components.  Remembering  that  in  optical  processors 
we  work  only  with  non-negative  components  which  we 
can  call  we  seek  a  parallel  way  to  search  for  max- 
(6,  j ).  The  search  need  not  occur  on  all  N  '  components 
in  parallel  if  (as  often  happens)  the  processor  does  not 
produce  them  that  way.  In  a  systolic  processor,  for 
example,  as  many  as  N  components  are  available  at  any 
instant.  We  can  find  the  maximum  among  them, 
compare  with  the  prior  maximum,  and  pass  the  larger 
value.  In  this  way  we  can  minimize  memory  require¬ 
ments  while  achieving  enough  parallelism  to  avoid 
slowing  down  the  process  substantially. 

A  parallel  search  can  be  made  by  subtracting  in  par¬ 
allel  from  all  available  component  signals  (<$,;)  a  ramped 
signal 

S(t)~Snt/T,  (36) 

where  So  is  the  maximum  allowable  signal  (a  physical 
constraint)  and  r  is  a  preselected  time  constant.  We 
then  detect 

=  b,,  -  S( t)  ( 37) 

in  parallel  for  all  ij.  Each  time  a  dt/  goes  to  zero  its 
detector  sends  a  unit  signal  to  a  counter.  When  the 
total  count  reaches  2 N2,  we  note  the  time  tQ.  Then 

max(6i;)  =  O’oto/r  (38) 


VI.  Applications 

Applications  of  eigenanalysis  to  direction  finding, 
bandwidth  compression  (Karhunen-Loueve),  pattern 
recognition,  etc.  are  familiar.  Here  we  want  to  point  out 
that  some  nonobvious  applications  may  prove  quite 
useful  as  well. 

Eigenvalue  determination  is  one  approach  for  finding 
roots  of  a  polynomial: 

P(xi  =  aoX”  =  ai.V-1'1  -r  +  a„  =  0.  (41) 

It  is  convenient  to  write 

PlX)  -aoiX’  ebi.Y'-1  +  (-6,),  (42) 

where,  of  course, 

b„  =  a*,ar>.  (43) 

We  can  then  write  a  matrix 

o  l  o 

O  II  1 

0  0  i) 

~br\ 

so  that  the  eigenvalues  -\  of  C  are  the  roots  of  P(X). 
The  eigenvalues  must  satisfy 

detiC  -  Mi  =  0.  i45i 

but 

detiC  -  A/i  =  i  - 1  PPi  Ai  ,i..  1 46 1 

The  form  of  C  is  easiest  to  see  for  a  low-order  polyno¬ 
mial.  Thus  for  n  =  4. 


~b  4  ■— frq  62  -6 1 


In  this  case 


det(C  -  A J)  *  det 


=  P(A)/a0.  '48) 


L-d<  3  ~bi  -f>ij 

By  our  method  we  can  easily  find  the  root  nearest  a 
chosen  value.  Likewise  multiple  roots  are  readily  de¬ 
tected. 

Of  course,  if  we  can  solve  P(X)  =  0,  we  can  solve 

P,(X)  =  P2(X)r  (49) 


P.v(X)  =  P,(X)  -  P2(X)  (50) 

must  have  a  zero  when  PfiX)  =  PglX).  More  generally, 
to  solve 


P,(X)  =  P2(X)  =  . 

we  form  the  new  polynomial 


P.v(X)  =0, 


Q(X)  =  I  [P,(X)|- 


Clearly  Q(X)  can  be  zero  only  if  each  of  the  P,  (X)  is 


VII.  Summary 

Prior  proposals  for  optical  computation  of  eigenpairs 
have  encountered  major  problems  relating  to  slow 
convergence  of  the  iterative  algorithm,  lower  accuracy 
on  less  dominant  eigenpairs,  and  low  accuracy  from  poor 
renormalization.  This  paper  discusses  some  methods 
reducing  these  problems  considerably,  although  it 
cannot  be  said  to  have  finally  and  definitively  solved 
them.  The  convergence  speed  is  increased  dramatically 
by  the  matrix  squaring  approach.  The  deflation  ac¬ 
curacy  may  be  improved  by  the  matrix  reformulation 
methods  discussed.  Excellent  use  of  the  available  dy¬ 
namic  range  can  be  assured  for  a  factor  of  2  decrease  in 
overall  speed  using  the  technique  described. 

Problems  related  to  degeneracies  and  numerical  ac¬ 
curacy  have  also  been  attacked  here.  In  particular  we 
have  been  able  io  show  that  matrix  squaring  handles 
degeneracies  easily  and  automatically  and  that  tight 
simple  error  bounds  can  be  determined. 

What  we  have  dealt  with  are  algorithm  related 
problems.  Implementation  problems  are  also  numer¬ 
ous,  but  they  are  beyond  the  scope  of  this  paper.  There 
are  also  rather  fundamental  problems  relating  to  the 
numerical  accuracy  of  the  final  answers.  We  believe 
that  these  problems  can  be  solved  so  as  to  make  optical 
eigenfunction  solution  practical  and  attractive. 


This  work  was  performed  under  U.S.  Air  Force  con¬ 
tract  F19628-82-C -0068. 
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Abstract 

♦  *► 

Optical  approaches  to  solving  the  Ax  *  b  problem  have  suffered  from  four 
difficulties:  (1)  an  inability  to  handle  the  problem  for  nonsquare  A,  (2)  the 
necessity  of  insuring  convergence  for  nonsingular  A,  (3)  the  inability  to 
handle  a  singular  A,  and  (4)  Inaccuracies  due  to  an  ill  conditioned  A.  We 
show  that  these  problems  can  all  be  solved  or  mitigated  by  singular  value 
decomposition  (SVD).  An  accurate  approach  to  optical  SVD  is  shown. 

Introduction 

Optical  computing  has  drawn  much  attention  in  terms  of  both 
architecture  and  algorithms  in  the  last  few  years.  This  paper  aims  at  a 
thorough  discussion  of  optical  singular  value  ^decomposition  (SVD):  a  topic 
recently  treated  by  Kumar.10  We  will  show  why  SVD  is  not  only  particularly 
well  suited  for  optical  computation  but  also  particularly  useful  as  part  of 
optical  computing's  repetoire.  Our  emphasis  will  be  on  a  particular  type  of 
problem  represented  as 


F-2 


■  b, 


V 


where  A  is  a  known  m  x  n  (m  rows,  n  columns)  matrix,  x  as  an  n  dimensional 
unknown  vector,  and  b  is  an  m  dimensional  known  vector.  Our  task  is  to  find 
x.  When  m-n,  this  the  familiar  case  of  n  linear  equations  with  n  unknowns. 
It  is  solvable  in  principle  if  A  is  nonsingular.  When  m  >  n,  this  is  the 
equally  familiar  problem  of  optimum  curve  fitting  (usually  using  a  least 
squares  criterion) .SVD  has  numerious  other  applications  in  image  processing, 
antenna  field  calculation  and  pattern  recognition  which  have  been  discussed 
elsewhere. 

-►  ■> 

The  Ax  ■  b  problem  is  arguably  the  most  important  and  most  cotmnmon 

problem  in  computing.  A  large  fraction  of  all  of  the  computer  time  in  the 

world  is  used  in  solving  large  linear  programming  problems.  Linear 

programming  solutions  occur  in  two  parts:  the  solution  of  large  Ax  -  b 

problems  is  the  most  time  consuming  part,  the  other  part  is  some  bookkeeping 

called  the  simplex  algorithm.  The  authors  have' heard  expert  mathematicians 

argue  that  the  least  squares  problem  is  the  most  important  problem  in 

mathematics  in  terms  of  its  impact  on  the  world.  Such  a  claim  could  be 

supported  by  applications  ranging  from  statistics  to  phased  array  antennas. 

Control  theorists  and  many  others  are  fond  of  posing  sets  of  differential  in 
♦ 

equations  in  the  Ax  *  b  format.  The  number  of  applications  there  is  quite 
large. 

♦ 

Prior  optical  approaches  to  solving  Ax  ■  b  run  into  a  variety  of 
problems.  First,  they  are  limited  to  the  m=*n  case  and  thus  omit  many 
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important  cases.  Second,  each  of  the  iterative  methods  has  a  convergence 
condition  which  can  be  guaranteed  only  by  going  through  a  precalculation  which 
either  confirms  the  convergence  or  transforms  the  problem  to  assure 
convergence.  Third,  the  result  of  our  calculations  may  be  in  very  serious 
error  if  A  is  ill  conditioned.  This  problem  is  compounded  by  the  inaccuracy 
of  optical  computers  relative  to  their  electronic  counterparts.  All  of  these 
problems  combine  to  make  optical  solution  of  the  Ax  ■  b  problem  less 
make  optical  soution  of  the  Ax  ■  b  problem  less  attractive  than  electronic 
solution  for  many  problems  even  though  optics  has  well  known  advantages  in  m 
and  n  size,  speed,  computer  size,  and  power  consumption. 

In  the  balance  of  this  paper  we  will  argue  that  SVD  alleviates  all  of 
those  problems  for  Ax  -  b  solution.  Specifically:  (1)  it  allows  m  *  n  and 
gives  che  least  squares  solution  in  that  case;  (2)  it  can  be  made  to  converge 
even  when  A  is  singular;  and  (3)  it  can  offer  us  a  way  to  find  good  but 
Inexact  solutions  even  when  A  is  ill  conditioned. 

We  consider  here  solving  the  least  squares  problem  for  non  symmetric  non 
square  matrices  A.  In  particular  we  will  be  concerned  about  matrices  which 
may  be  less  than  full  rank,  and  which  may  be  ill  conditioned.  That  is,  if  the 
dimensions  of  A  are  m  x  n  with  m  >  n,  then  we  include  for  consideration 
matrices  which  have  rank  k  <  n  and  have  pseudo  rank  l  <  k.  A  natural  approach 
to  sucn  problems  is  through  the  singular  value  decomposition  of  A.  A  can  be 
expressed  as 


A  3 


W  A  V 


where  W  is  a  m  x  m  orthogonal  matrix  and  V  is  an  n  x  n  orthogonal  matrix.  A 
is  a  m  x  n  matrix  whose  only  non  zero  elements  are  the  "diagonals'-,  A  if,  for 
i  -  l,k  where  k  is  the  rank  of  A.  The  singular  values  A ^  are  assumed  to  be 
in  descending  order  A 1  _>  X2  •  ••>.  ^k*  have  dropped  the  second,  redundant 
index.  Performing  the  implied  matrix  multiplications  yields 


A 


k 

V 

L 

i 


(2) 


We  use  the  lower  case  letters  w  and  v  to  indicate  column  vectors  of  W  and  V 
respectively.  The  subscript  i  indicates  the  column  number.  If  the  matrix  A 
has  a  pseudo  rank  of  i  <  k,  then  the  singular  values  A^+i  to  A^  will  be 
very  small.  The  Eckart  Young  Theorem11  suggests  that  the  last  k -l  outer 
products  can  be  deleted  from  the  sum.  That  Is  A  can  be  written  as 


l  l 

A  -  A  +  AA 


(3) 


where 


l 

A 


l 

c 


I- 1 


*  T 


v 


i 


and 


F-5 


E« 


i 


« 


A  A 


k 


I  X 

i-Z+1  ' 


w. 


V 


T 
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Eckart  and  Young  showed  that  A^  is  the  best  rank  l  approximation  to  A  in  the 
Frobenius  norm.  The  norm  of  the  error  term 


flAAl  2  -  l  A2 
F  i«£+l  1 

is  given  by  the  square  root  of  the  sum  of  the  squares  of  the  neglected 
singular  values.  If  the  elements  of  the  matrix  A  were  obtained  experimentally 
or  if  they  are  stored  in  a  computer  with  low  precision,  such  that  the  stored 
version  differs  from  the  "true''  version  by  <5A  then  carrying  more  singular 
values  than  that  number,  l,  for  which  lAA^I  ■  I15AJI  is  useless. 

For  numerical  stability  we  replace  A  with  A2 .  In  matrix  form  we  write 


/  -  /  A*  (VV 


(A) 


where  Vr  is  the  m  x  l  matrix  whose  columns  are  the  first  l  columns  of  W, 
is  the  n  x  l  matrix  whose  columns  are  the  first  l  columns  of  V  and  A* 
is  the  l  x  l  matrix  of  singular  values  Aj,i*l,£. 

The  least  squares  problem  Ax  »  b  is  transformed  into  a  new  one  by 
multiplying  on  the  left  by  and  using  the  fact  that  W^W»1. 
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r< 


< 
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% 

P. 


T  +  T+  T+ 

W  Ax  “  AV  x  ”  W  b: 


(5) 


Defining  y  *  V^x  and  g  «  W^b  the  least  squares  problem  is 


Ay  -  g 


The  components  of  y  are  given  by 


(6) 


yi  "  r: 


i  -  l,k 
i  ■  k+1  ,n 


(7) 


The  solution  vector  x  is  obtained  from  Vy.  The  norm  of  x  is  a  measure  of  the 
stability  of  the  least  squares  solution.  It  is  obtained  from  the  square  root 
of 


»xJ  2  -  8y»  2  -  l  (t-M 

i  Ai 


The  square  of  the  norm  of  the  residuals  is  given  by 


(8) 


R2  *  i!  Ax  -  bS  2  ■  1  g2 

i-'ic+l 


(9) 
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In  the  evenc  Chat  A  is  ill  conditioned  some  of  the  columns  of  A  are 


nearly  linearly  dependent,  and  some  of  the  singular  values  will  be  small. 
Contributions  from  the  small  singular  values  lead  to  erratic  changes  in  x  and 

in  a  dramatic  increase  in  its  norm.  Defining  a  pseudo  rank  of  1  less  than  k 

*0  l  *1  +1 

we  obtain  solutions  x*  for  the  least  squares  problem  A  x  »  b  defining  g 

as  (V4)  T  b  we  obtain 


yi 


i  -  1  ,1 


y 


i 


0 


i  ■  1+1 ,n 


(10) 


The  solution  vector  x4  is  obtained  from  V*y.  The  square  of  the  norm  of 
xhs 


*i  2 
ix  r 


(ll) 


and  the  square 


or  rue 


1  2 
ilR  1 


l+l  -*■  2 
IIA  X  -  b  a 


the  pseudo  rank  £  is  chosen  so  that  the  norms  of  the  solution  vector  5x4ll, 
the  residual  1 R ^ ii  and  the  error  matrix  HAA^a  are  exceptably  small.  More 


I 


details  of  this  aspect  of  least  squares  problems  can  be  found  in  Lawson  and 

1  2 

Hanson. 

When  the  pseudo  rank  l  is  much  less  than  n,  a  method  for  finding  only  the 
first  i  singular  values  and  the  residual  matrices  W^  and  is  desired. 

We  propose  obtaining  this  partial  singular  value  decomposition  of  A  by  use  of 
a  power  method.  An  iterative  scheme  can  be  based  on  the  following  pair  of 
equations . 


A  v. 


XiWi 


and 


T- 
A  w. 


Xi  Vi 


(13) 


(14) 


which  are  obtained  from  Eq.  (2).  Starting  with  an  initial  guess  at  v^  namely 
v°  and  an  initial  w° ,  and  an  estimate  of  the  singular  value,  ,  can  be 
obtained  from  Eq.  (13).  w°  in  turn  can  be  used  in  Eq.  (14)  to  find  an 
improved  vj .  We  use  superscripts  to  indicate  iteration  numbers. 

After  J  iterations  we  have 


(16) 


X1W1 


A  v, 


+  J  +J-1  +J  ♦J-l 

The  procedure  can  be  stopped  when  5vj  -  v  1  and  Iwj  -  II  are  sufficiently 
small.  This  procedure  will  yield  the  dominant  singular  value  X^  and  singular 


vectors  w^  and  vi 


Applying  the  procedure  to  the  deflated  matrix 


■+•  -*■  x 

A  -  A  -  \l  wL  vL 


(17) 


will  yield  X2 ,  W2  and  v2  and  so  on.  This  approach 
by  Shlien13  and  by  Kumar.10  An  alternate  approach 
Eq.  (15)  and  (16)  are  substituted  into  one  another 


has  been  recently  proposed 
is  suggested  here.  If 
one  obtains 


2  -J  .  T  *J-1  +J-1 

X  v  -  (A  A)v  ■  S  v 


(18) 


and 


,  2  *J  / ,  .T.  ♦J-l  M  +J-1 

X  w  »  (AA  )  w  ■  M  w 


(19) 


This  approach  is  equivalent  to  finding  the  principal  eigenvectors  of  the  n  x  n 

and  m  x  m  positive  semidefinite  matrices  S  ■  A^A  and  M  *  AA^, 

♦ 

respectively.  The  right  singular  vectors,  v^  of  A  are  eigenvectors  of  S 
while  the  left  singular  vectors,  w^  are  eigenvectors  of  M.  The  non-zero 
eigenvalues  of  S  and  M  are  equal  to  the  square  of  the  corresponding  singular 
value,  \2. 
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It  is  not  necessary  to  find  the  eigenvectors  of  both  S  and  M.  A  simple 
approach  is  to  find  the  eigenvectors  to  the  matrix  of  smallest  dimension, 
namely  S.  Several  approaches  to  the  use  of  the  power  method  for  eigenvectors 
of  symmetric  matrices  have  appeared  in  the  literature .6 » 14 » 15  Once  the 
first  eigenvector  v^  of  S  is  found,  wj  can  be  obtained  from  Eq.  (13).  The 
matrix  A  can  be  deflated  by  the  combined  use  of  Eq.  (17)  and  Eq.  (13). 

A  -  A  -  A  v1  vlT  (20) 

The  positive  semi  definite  matrix  A^A  can  be  formed  and  the  procedure 
repeated  to  find  V2 ,  *2  and  *1  and  90  on* 

One  concern  in  using  a  power  method  for  singular  value  decomposition  is 
the  loss  in  dynamic  range  that  occurs  when  ATA  or  AAT  is  formed.  As  Eqs. 

(18)  and  (19)  we  derived  from  (15)  and  (16),  the  formation  of  these  square 
matrices  results  from  any  formulation  of  the  power  method.  The  best  one  can 
do  is  to  initially  equilibriate  the  columns  of  A  and  normalize  the  approximate 
(singular  vectors)  eigenvectors  at  each  Iteration.  Equilibration  is  the 
process  of  finding  that  diagonal  matrix  D  which  will  scale  the  columns  of  A  so 
that  they  have  unit  length.  We  let 


V  -  “ 


4 

l 

> 

»  . 

S 

* 

;* 


4 


4 


and  solve 

(AD)  (D_1  x)  -  b 

We  assume  Chat  A  was  previously  equilibrated  in  the  above  discussion. 

The  key  to  numerical  stability  in  the  power  method  is  not  in  the  formation  of 
the  square  matrices  S  and  M.  The  key  is  that  deflation  be  performed  on  A  in 

order  to  find  additional  singular  values.  One  should  not  attempt  to  deflate  S 

1  3 

or  M.  The  success  of  our  proposed  method  as  well  that  the  methods  of  Shlien 
and  Kumar10  is  based  on  this  deflation. 

The  difficulties  that  we  address  in  terms  of  dynamic  range  can  be 
illustrated  by  the  following  example  matrix. 


A  • 


/I 

0 


(21) 


where  e  is  within 
of  its  columns  is 
matrix  A  has  rank 


■ 


the  dynamic  range  of  the  computer  and  e2  is  not.  The  norm 
(1  +  e2)1/2  -  1,  so  the  matrix  is  equilibrated.  The 
2  but  is  ill  conditioned.  The  matrices  S  and  M  that  will  be 
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generated  in  the  computer  will  be  the  rank  1  matrices  on  the  right  in  Eq.  (22) 
and  (23)  respectively.  The  key  point  here  is  information  about  vj  and  wj  are 
still  retained  in  S  and  M  while  information  about  v2  an<*  *2  are  lost. 

Numerical  instability  will  occur  when  attempting  to  deflate  S  or  M  to  find 
subsequent  eigenvectors.  Numerical  stability  is  maintained  only  if  A  is 
deflated  through  Eq.  (20).  That  is  the  power  method  will  find 


✓  2  1 


and  deflation  of  A  will  yield 


0  0 

A  -  j  -1  1 

1  -1 


v2  (1//2)  [l,  -l]T  will  be  found  by  applying  the  power  method  to  ATA.  The 
key  to  the  successful  use  of  the  power  method  for  singular  values  is  the  use 
of  the  deflation  of  A.  Attempts  to  deflate  S  or  M  will  yield  matrices  which 
contain  only  noise.  The  principal  eigenvectors  to  the  matrices  S  and  M  can  be 
obtained  from  the  power  method  however.  The  use  of  a  power  method  requires 


the  formation  of  at  least  one  of  these  matrices. 
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We  summarize  Che  proposed  procedure  for  singular  values  decomposition  as 


•s 


follows 


■j  (i)  Equilibrate  A,  call  it  Aj  . 


i 


T 

(ii)  Form  S*  -  A^  A^ ,  scale  if  necessary. 

(iii)  Find  the  principal  eigenvector  v^  of  S*. 


1 


(iv)  Calculate  A  v-^. 

(v)  Find  X^  by  normalizing  A 
if  X i  *  0  stop. 

(vi)  Wt  is  the  resulting  normalized 
A  Vi. 

+T 

(vii)  Form  A^+i  *  A^  -  A  v^, 
scale  if  necessary. 

(viii)  Go  to  ii. 

This  procedure  will  terminate  after  obtaining  the  2th  singular  value  X^ 
singular  vector  V£ .  The  least  square  problem  is  then  solved  using  Eqs 

(10),  (11)  and  (12). 


F-14 


and 


( 


References 


1.  J.W.  Goodman,  A.R.  Dias,  and  L.M.  Moody,  Optics  Lett.  2,  1  (1978). 

2.  H.J.  Caulfield,  W.T.  Rhodes,  M.J.  Foster,  and  S.  Hovitz,  Opt.  Common.  40, 

86  (1981). 

3.  D.  Psaltis,  D.  Casasent,  and  M.  Carlotto,  Optics  Lett.  _4,  348  (1979). 

4.  D.  Casasent,  J.  Jackson,  and  C.  Neumann,  Appl.  Opt.  22^,  115  (1983). 

5.  R.P.  Bocker,  H.J.  Caulfield,  and  K.  Bromley,  Appl.  Opt.  _22 :,  (1983). 

6.  H.J.  Caulfield,  D.  Dvore,  J.W.  Goodman,  and  W.H.  Rhodes,  Appl.  Optics  20 , 
2263  (1980). 

7.  M.  Carlotto  and  D.  Casasent,  Appl.  Opt.  21_,  147  (1982). 

8.  J.W.  Goodman  and  M.S.  Song,  Appl.  Optics  _21.,  502  (1982). 

9.  W.K.  Cheng  and  H.J.  Caulfield,  Opt.  Common.  43_,  251  (1982). 

10.  B.V.K.V.  Kumar,  Appl.  Opt.  23,  (1983). 

11.  C.  Eckart  and  G.  Young,  Psychrometrika  211  (1936). 

12.  Charles  L.  Lawson  and  Richard  J.  Hanson,  "Soluins  Least  Squares 
Problems,”  Prentice  Hall,  New  Jersey  (1974). 

13.  Seymor  Shlien,  IEEE  Trans.  PAMI  _4,  671  (1982). 

14.  B.V.K.V.  Kumar  and  D.  Casasent,  Applied  Optics  2_1_,  3707  (  1982). 

15.  J.  Gruninger  and  H.J.  Caulfield,  Applied  Optics  23  (1983). 


APPENDIX  G 

APPROXIMATE  SINGULAR  VALUE  DECCMPOSITION 

(This  work  was  presented  at  the  1983  OSA 
meeting  as  noted  using  the  attached  view- 
graphs.  The  write  up  for  publication  is 
still  being  pursued.  We  will  submit  the 
paper  for  publication  in  Applied  Optics). 
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Contributed  Papers 


M V 1  two- Dimensional  Optical  Fourier  Transformation  oy 
Time-Integration  Methods.  William  T  RHODES.  KEITH  3 
1U  EHLE.  AM)  ROBERT  E.  STROUD,  Scnooi  of  Electrical  Engineer  n*> . 
1  Institute  T*’chfn-.on\ ,  Atlanta,  Georgia  30332  —Time* 

integration  optical  processing  methods  in  the  past  have  been  applied 
■n  the  spectrum  analysis  <»f  one-dimensional  signal  waveforms. 
However.  they  can  also  be  used  to  evaluate  the  spatial -frequency 
content  of  two-dimensional  images  The  key  idea  is  a  mapping  of 
object  points  into  associated  two-dimensional  fringe  patterns.  A 
ivstem  using  a  modulated  laser  and  two  orthogonally  scanned  mirrors 
has  been  used  to  produce  a  time-integration  (complex)  spatial-fre- 
quency  spectrum  of  a  test  pattern.  Helpful  analogies  between  svstem 
operation  and  incoherent  holography  can  be  drawn.  *13  min.) 

MF2.  Cosinusoidal  Transforms  in  White  Light.*  SHEN-GE 
w  ANi.  aND  Nh  Ho|„as  GEORGE.  The  Institute  of  Optics.  L  nu  ersitv 

R^cnester.  R:»i'h»-'*ter .  \eu  York  1 4h2?  — Theory  and  techniques 
>r  white- light  .nter>rometrv  ire  bring  stud.ed  :n  order  to  develop 
->*  -T.eth.MiH  a  optica*  pattern  recognition  In  white-light  iilumi- 
•laD-u.  »r  with  rough  .nput  -meets.  the  conventional  diffraction- 
patfern  •sampling  -..stem  is  n*.-i  applicable  without  the  use  of  an  in- 
v 1  nerent-to-coherent  Oliver  er  A*  an  alternative  approach,  we  re¬ 
port  ,-n  i  I R  tract  ton -iuni  ted  transform  system  that  can  be  used  with 

•  pvctranv  or-iao.  ..patiullv  inc  merem.  illumination.  The  transform 
obtained  's  a  -patia.,  '  wo- dimensional  <  osine  'ransform  of  the  input 
plus  a  bias  term  The  optica,  svstem  c-  nsist*  of  a  double-imaging 
mferferorr.rier  *(tr.  a  oeam  spatter  a.  a  ?w«  r.gnc  angle  prisms  fol- 
:"weo  ’*.v  in  achromatic  optical -tran.-norm  lens  pair.  The  design  of 
’he  inter  ter, meter  uid  the  achromatic  optical  transform  are  detailed 
o’  i  a- T.f  r.iAteq  -wo :•  earlier  versions  Excellent  diffraction-limited 

•  ■•‘■“oi  mamv  -.hLumni  tor  rne  -*nare  visible  spectrum  m  an  all-glass 

-u  rn  \  cn.-ttMi  -de  arriv  ,s  aimed  in  the  optical  transform  plane 
.n  r  ler  mterfji  e  trie  -Astern  :>•  1  digital  computer  The  bias  term 

' i:r  •’Dsir.uMi'dfU  *ransb»m  . .  -untr acted  electronically  Noise 

•  nidations  are  described.  » \eune  :  ran.-* forms  Have  been  obtained  for 
t  v  .ir:— .  t  rougn  ./meets,  and  recent  experiments  are  described  i  id 
nun  - 

hi*  u  -.  j:  r  am>  hv  •  r.r  •  is  Mr  F-*rce  <  dfice  ot  Scientific  Re- 

-iO'n  mu  'h**  1  S  \rnv  Hfe*:-  r\  i  )‘hce 

MFT  Matrix  and  Image  Decomposition  by  Projection  F  l¬ 
ooding  .'iHN'rk  '  .Ni  ,K.<  v.NDH  >  \i.'FiKLj ).  Aeroavr.e  Resea- 1  h 
i*  \1:r 'ni'i*  .  .  :m d*’-;-  d  M a. -is- tenure 1 1 < !  1 .' *2!  — I’he  image 

>c<  mpo  -.  ."  pr-  mum  i>*j  oropi  * i».-n  **ucocLng  in  pro- 

1  ’■  -ii  -i.e.g  i  '  w<  ■  •.Jirnens;,:nj|  -irrav  •»!  data  is  collapsed  ir 


projected  onto  two  -r  more  one  dimensional  arrays  fhe  primarv 
operation  of  recunstructior.  is  the  oackprojection  -<i  the  une-dimen 
sionai  arrays  miu  twu-dimensionaj  space  if  the  matrix  is  column 
collapsed  and  row  collapsed,  the  hackprojection  correspond*  to  the 
vector  outer  prouuct  of  the  -ine -dimensional  arravs.  This  nac* .pro¬ 
jection  yields  a  rank-one  approximation  to  the  ordinal  matrix.  Re¬ 
peating  this  process  on  successive  resiauais  leads  to  a  decomposition 
of  the  image  into  a  sum  of  rank  -one  matrices.  The  projection  method 
can  be  considered  as  an  approximation  to  singular  value  decomposi¬ 
tion.  Lower  bounds  to  singular  values  are  obtained.  We  .^how  that 
the  method  is  computationally  simple  and  that  its  extension  u> 
three-dimensional  images  is  straightforward.  1 13  min.) 

MF4.  image  Processing  in  Signal-Dependent  Noise  (  sing 
Local  Statistics  and  the  Generalized  Homomorphic  Transfor¬ 
mation.  H  H  XRSENAL'LT  AND  M  ..EYESQl’E,  Department  ,.»f 
Physics,  LR<  >L  Vnicersue  Lacai ,  Ste-roy,  Quebec  G  i  K  Tp-l.  Can- 
aaa  —  In  an  image  with  signal-dependent  no*se.  i  general  point 
transformation  can  tran^turm  '.he  image  into  a  space  where  the  noise 
is  independent  ->i  the  signal.  Local  statistics  methods  of  image  pro¬ 
cessing  appropriate  to  auditive  noise  then  may  be  applied  to  the  image 
before  transtormmg  back  :o  the  original  space.  Experimental  results 
for  film  grain  noise  and  for  multiplicative  noise  are  shown  and  com 
pared  with  resuits  without  the  nomomorphic  transformation.  This 
method,  suitably  implemented,  is  about  10  times  faster  than  the  global 
method  using  the  fast- Fourier  transform.  (13  min.) 

MF5.  Minimum  Bias  Pupil  Design  for  Bipolar  Incoherent 
Spatial  Filtering.  JOSEPH  \  mait  and  w  T  RHODES,  .icnnol  of 
Electrical  Enmneer.n*; .  Georgia  Institute  of  Technology.  Atlanta, 
Oeur^u  30332  —  Incoherent  spatial  filtering  offers  a  number  ot  ad¬ 
vantages  over  coherent  rpatiai  filtering,  most  notablv  its  insensiimtv 
U>  dust  and  system  imperfecii./ns  However,  incoherent  systems  are 
i.near  in  intensity,  not  n  complex- wave  amplitude,  and  the  spatial 
impulse  response.  >r  point- spread  function,  is  nonnegative  and  real. 
The  removal  of  bias  by  means,  for  example,  of  image  subtraction, 
allows  for  the  syntheses  of  bipolar  point -spread  functions  and  ha*  been 
demi)nstrated  with  miutichannei  hybrid  eiectro-opncai  systems, • 
Bias  reduction  prior  r..  detection  can  increase  «:ontrast  and  decrease 
noise  m  the  final  processed  image  and  may  t>e  achieved  through  pup/, 
mask  design  Bias  rwiucf.on  tnr  -ugn  pupil  design  may  Re  presented 
as  a  problem  in  constrained  minimization  ;nus  ail-»wing  standard 
-optimization  techniques  to  be  used  to  rind  minimum  bias  pupils.  ■  \  '3 
mm..1 

■  VV  T  RhtxiP>  » )|>t  f.r\  19.  ■.  j ,  .Am-  ;  •m'- 
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EXAMPLE 


OPTICAL  COMPUTING: 

THE  COMING  REVOLUTION 
IN  OPTICAL  SIGNAL 
PROCESSING _ 

Development  is  progressing  toward  a  new  generation  of  optical 
computational  devices  that  may  provide  for  ultra-high-speed 
matrix  algebra  and  for  the  density  of  interconnections  needed  in 
optical  supercomputers. 

By  H.  John  Caulfield,  John  A.  Neff,  and 
William  T.  Rhodes 


A  multichannel.  systolic  acousto-optic  Ornery  convolver 
(SACBCt  .s  architecturally  configured  as  a  systolic -array 
processor  The  architecture  provices  a  high-speea 
means  of  metnx -vector  multiplications  using  the  digrtci 
multiplications  via  analog  convolution  algorithm.  This 
aigomnm  era  a  systcnc  acousto-optic  implementation 
penrn  t  the  speec  of  colics  to  Pe  combined  with  the 
accuracy  of  aigtoi  computation 


100 


NOVEMBER  fC, 


Optica!  \  ", •„ -r v.-::s  -  ?  >:  *•  '.he 

nineteenth  ■-o-  <  ’  ■  . u  ?.  •vle^’i,  Huy 

gens,  Abbe.  L:ot.'  ana  ?.r;C  and  its  g"eate--. 

promise  r.  the  -werty-first  utnr-iry  there  ir.  tie 
late  twentieth  century  h  -  number  of  ..'P-ca; 
processing  ?v  moms  1  sneer- mu  ana  v-e-s.  ?vr- 
thetic-uc*: -tv-  -  r-  ■.*  ■>  processors.  mb./ttv  Sanc¬ 
tion  gere-axors  * ■>:.  have  aire.mv  suppler  te-d 
their  electronic  'Mimt^rcart*  and  others  may 
succeed  soot;  •  pf.trerr  ret ngm  non  cjrectu  n  link¬ 
ing,  etch.  The  adv  •  -..ares  of  notxcs  iv’r  e’ectmn- 
ics  in  these  sv  stems  ..sclnde  some  ..ynb' nation  of 
lower  cost,  mewd  size  ’ .-w, r  ;•  ••  er  consump¬ 
tion.  higher  -m.  md  v-  ,t  cm  '7 y  rare  a.  reli¬ 
ability. 

Although  is  nt  vet  real  rat:  c  to  plan  for  a 
genera) •  purpose  optical  rr.-rrmter  >:  is  oocrib’v  to 
trunk  3?*. ou si /  abiuf  -la; rr-  .••:••’>’  ’  optical-array 
processors.  suggested  "■•••  Fig  1.  'sat  can  he 
used  at  a.‘vr>--~’  ■  re  yta  ;.nm;  ’  'er*  *  rr  nertorm- 
mg  spc':ruc  .rim  bis  zo  Tmntationn  a:  eery  high 
speeds  Pe*igr.s  in.  rumen,  v  under  ci.nsider.a- 
non  for  uruo-bigp-apesc;  cpticai  tncfusct!  to 
evaluate  poiync trials  matrix- vector  products, 
matrix-matrix  produc’s,  and  solutions  of  sets  of 
’■near  equation," 

This  article  rev!  v-’S  the  developments  of  the 
last  seve’al  dc code's  mat  led  to  this  posirion, 
describe*  b  ";eu  ■  some  ’.mportant  areas  of  current 
research  and  develop rae.at.  and  lists  several  areas 
of  expected  nuy  r.ftur-  development. 

Philosophy  ar.d  :ec-*nt  developmenH 

Operations  •le.'H-vnod  by  optica:  systems  or?  de¬ 
scribed  by  •untie  n.i:1,,--.:  u.cs;  convolution. 
-nu.it 1 ni’C” o".  -’tc  ;  requires  ml  >  a 
minor  chr.n<t*  v*.  uc. -;on*  t.  convert  from  mathe- 
m.u  •;:>  l-,  ?.  beserntmn  .nathe-oaucs  as  toe 
goal  of  t>c  •  ut.es.  due"*  a  v.  wpo:  u  was  taken  oy 
C  itrona  at  t>  :  vv«'«  t>  of  .vi  chip  an  as  early 

•is.  srim  -v  •  ,er  a-  •:c.»cr.ted  the  apt-uation  u 
optical  -vsV  :..  .be  •v-.ui  a*..:.-  •  g-neml  su- 
pernositior.  ••  "/  j  ■  tr.c  to  ’  »•  u  •cation  o? 

d.  v  ec\or  jv  .  t.  *>„■.  v  4  w  errir 

v  ;■ ?•  T:''j?5.rv  rv« 

Lug*.  C.  <  - . .  V.-'.T«r»  C-ooc- 

3lil.n- — .  ■  >;,r.  d, — •. ..:  «v-v-;: 

br  per'.'"’.'  .  .  a s  r: t  '*’i  urn 

tions.  These  ■■  nn-  t  h.-v  so'er  u-  .  •.. 

ccn-centr’.-yi  :  ■  ■  •  in  n  lus v,  '  . 

?n  'in'  »"c’.  t"  "o  ’  •  1 '  :•  .  an 

that  s.  ••  “■ 

er  or  c*  n  n  ■■  ■ 


"'‘n-me  the  nast  several  years  attention  has 
:•;-*) p<'  r->  «  d iterant  application  of  optics  to 
mathematical  operations,  in  this  case  operations 
that  art  r  .imer.cal,  sometimes  discrete,  and  often 
aigebraif.  ui  nature  ’ndeed,  the  redirection  of 
attention  ha?  been  so  vigorous  tnat  many  view  ;t 
as  a  snt’i  revo'oy -or,  m  optics  or.tica'  signa: 
processing  is  hegtriumg  to  encompass  what  many 
f'eel  is  aptly  de.cn bed  as  cp*ica!  computing. 
where  tine  norm  is  tally  intended  tc  imply  close 
comparison  with  the  operations  performed  bv 
sdtntU’c  digital  computers.  The  optical- array 
pr<~<s-.3or.  .tientionec  earlier,  forms  tne  basis  for 
this  re% o’uf.on.  fThe  term  optical  computing  has 
■>een  used  occasionally  for  Pearly  two  decades 
now  :n  'U  nr  tcti  m  with  analog  ontica i  processors, 
but  a  major  fraction  o-'  the  optical  signal-prcxiess • 
b:F  comm u m tv  has  or *••*?»•  felt  comfortable  with  •: 
because  of  the  in  oiled  comparison  with  genera '  • 
ourpo&e  digiml  computers.  That  situation  ; 
noised  ft  r  change.: 

Tn  retrospect,  the  beginning  of  modem  opticel- 
array  processor?  was  the  invention  of  what,  is  now 
often  called,  the  Stanford  optical  matrix-vector 
multiplier  'OMVM).  This  device,  illustrated  in 
Fig.  2.  has  a  capability  of  multiplying  a  2 0* 
component  vector  by  a  100  x  100  matrix  in 
roughly  20  ns.  Components  of  the  input  vector  x 
are  input  via  a  linear  array  of  LEDs  or  laser 
diodes.  Trie  light  from  each  source  is  spread  out 
horizontally  by  cylindrical  lenses,  optical  fibeis. 
cr  planer  lightguides  to  illuminate  a  two-dimen¬ 
sional  '2-D'  mask  that  represents  the  matrix  A. 
Light  from  t.'e  r.ask,  which  has  been  reduced  in 
intersity  by  verb  variations  in  the  mask  trans¬ 
mittance  mr.ction.  is  collected  column  by  column 
and  directed  to  discrete  horizontally  arrayed  de¬ 
tectors  The  outputs  from  these  detectors  repre¬ 
sent  the  enmnonents  of  output  vector  y,  where  y 
’?  g:ver,  bv  the  matrix-vector  product  y  =  Ax: 


*■'  "  u  -  ^  a.M-v!  1  -*V 

7  ••••“  ..gilt  •nt<»n?:tv,  wh.ch  is  always  nonnega- 
vo,  a  ...s^d  to  -  .or  -  «ent  the  varous  mathemr.t:- 
•r.  mis ;■  * : v ^  1:3!  ending  tech-upues  must  he 

•m-  ■  -  •  -  '  70  qr.rj  ovgp’ive  or  com- 

-•  r.  •  1  .tv  ir>-  '  n  i>+  » 7.  ■  ,rn  niud/i ! 

-  v  ,r  ■  v**d.  tiie  Stanfo  ti  ','MVM 

..  .  v.  t  ■  '  • »  -i>-it.-ai'v  v-iub  'imitations 


•  Accuracy  in.. v ••  .r*?;  .iccufdcy  vv*cn  *.vr. icr* 

iht'  SO'^rCtr  .  Cof.  be  COntTG.  >  cG  cU’.il  int 

OUtpu'  : G  wt.J Cied  ;ectd, 

•  tJyr,.aii*i£  rdri^tf  jc-u'v'v-  ir.u  or  detector  ..iitni- 

ec, 

•  Rapid  'ipca:iiug  cs  mv  matrix  A  requires  me 
use  oi  i  high*cuaii:>  2-J  rs&d- write  :.;*ar-s- 
purenc y- — a  ^pauim  . .. g a c  raoQ a.ai,or  ir>.LA*/ — 
wn0'fc  opCiCJ-i  'r  ci-  *d  ■?;  i  *  c  Cc;  pattern  c  a  a  oe 
c n an g t;ti  r a p » <1  * y  u  r*  i ■'•  rx  u  r. &.<■::  y ,  aucn  a  oe v * c e 
uGtj  nut  ve;  exist  vr.  .it .-..  the  dilute  characteris¬ 
tics,  although  canci’::.i:.r  aevie-ss  are  being  un¬ 
proven  rapidly. 

Despite  these  urav, .tucks,  the  Stanford  develop¬ 
ment  brought  about  an  important  swing  within 
the  optica  signa;  -  process- ng  community  from  a 
;;re<  ccupattor.  witn  c'-norent.  Fourier -transform- 

OciSeO  p r CO'.''; SO  Tl  1C  ,  Z.y  'j Cxr^r xr C t ,  g'/OmffZiiCSl  Op* 

tics-Od-h-a  ;r  Ktissor..  ’ :  ...i  m.-.  r  a  u>  note  rn at 

rh;?  f.;a  p-  :  •'.•••••*  c'.  *  -  ;\  a  j  *•  ; :: .  a  a  b  y  Pro  i . 

‘  V  ,  y.ct  ..  'Af'w'r  b  OK  Oii  Po'driSf 
> ;  ;*  ‘es/.  .  o  .  ,r ,».\s  *o  firmly  in 


•v  *  ■-  •  3  ^e  .  jj  I  *  v  1  .1  ).  Ovf*  — SOi  C».  U.'-i 

opt*;  JkiQ  -it  5‘>e  "Jd  -..i  *  \  v.r.a  auiuty  uO 

.  a  put  a  r.  cl  <_  u :  p  u :.  ;  . .  r.  ;•  ru*  n  re  q  ui  -  ed 

electronic  stem •;  .m*  .opr  ach  :.o  circuinvent- 

*r»g  ch:i  pnaoiom  us  to  tr.e  UM  v  at  'or 
digori'nmi.  a » u'- r •  t  t \ m*.  .-os*,  *  .  u  c  p  u  i  is  a i  ;ct.- 
ec  in  ana. eg  ft •r’r :  '  o  the  :jut.  A  variety  of 
iterative  or  ear  r.r.a  ...e>  .ft!  r  j  ••*  were  devei- 
viipe  u  :>v  • .  a'a.r; 

Ah •  -a t* s .  ».v.e  t\ , » i : . . •  ,  . . *. • , r. v •  . •;. .  t  .r  •.  e r :: i ; r.  :;•  f 

rn..r  .".a  ey.  4^t;  r*  *.  f  •■  .  ior  •■'•  oi  or 
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!*tiC  -  . p«  t.  ■•.  hki.o.e  WoV  <h .eciron..Ci  uoet- — oy  ^'vr.? 
aigiLa*  i  lies  ec.  to  t r. e  t.rst  suggesiiOn  tty  .Jor;iV.^ 

oi  u  iDtiiui  v-/  ac-iiew  aigttai  optiCwh.  ihi/'ci.  triC 

i? e V.' i ■.  enters  n*r  oi  h y e to i i c * <2 i*7" ay  proce.-.^aTit 

hf.ouiO  Ov  u men o.o ■  to  optica:  i in p a ence n ui i i o r* . 
. ^filler  a u •question  ied  co  ’aotk.,  prmtan.y  oy 
Vw-dv.^tiuic  iutiC.  on. oucc ,  or.  an  opt  can  sys’Ojnc- array 

pi  .'CcSaO;  ut:  vcit DeC  CcnO  V.  bOOH.  OOt/j  pUDJj.Sr.6C 

•-i/;.u  u  nee  v*orK  oy  iani-jra,  Casacciiv  ano 

omens  v.avar.cec  crus,  area  greaciy. 

Syscoiic  array  processing,  developed  prinopai- 
.y  by  H.  T  Kung  --.c  Carr.egie-Meiioc  Umversity 
anc  S  Y.  :\ung  at  the  university  of  Southern 
Caiiiorma,  is  an  algorithmic  and  architectural 
approach  to  overcoming  limitations  of  VLSI  elec¬ 
tronics  in  implementing  hign-speec  signal -pro¬ 
cessing  operations.  Systolic  processors  are  char¬ 
acterized  oy  regular  arrays  of  identical  tor  nearly 
identical.:  process:  -  cells  ; facilitating  design  and 
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systo'  .  ;:r  -c  •-  .  u  '.•/••rv.bm'-.  ::  arch  tecturai 
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of  p.  2-c^T.ponent  vector  by  a  2  >  2  matrix 

The  *'■-«{  input  to  the  acousto-optic  ceil,  vector 
component  zu  produces  a  short  diffraction  grat- 
tng,  with  di .-.fraction  efficiency  proportional  to  x,, 
that  moves  across  the  cell.  When  that  grating 
segment  is  in  front  of  LED  1,  as  shown  in  Fig. 
Tb;,  the  LED  is  pulsed  with  light  energy  propor¬ 
tional  to  matrix  coefficient  alif  and  Integrating 
Detector  l  is  illuminated  with  light  energy  in 
proportion  to  ■  h*>  product  aux.  The  next  critical 
moment  occurs  when  the  xx  grating  segment  is  in 
front  of  LED  2  and  a  second  grating  segment, 
with  diffraction  efficiency  in  proportion  to  vector 
component  x2,  has  moveo  in  front  of  LED  1.  as 
shown  m  Fig.  3(c).  At  that  moment  LED  1  is 
pulsed  with  light  energy  in  proportion  to  ol2  and 
LED  2.  with  light  energy  in  proportion  to  a2i  The 
integrated  output  of  Detector  1  is  now  proportion¬ 
al  to  a11x1  -  a12x 2,  which  is  the  output  vector 
component  v.;  the  integrated  output  of  Detector  2 
is  021X1.  The  final  critical  moment  in  the  compu¬ 
tation,  shown  in  Fig.  3(d),  occurs  after  grating 
segment  .r2  has  moved  in  front  of  LED  2.  A  final 
pulse  from  that  LED  in  proportion  to  a22  yields  at 
the  output  of  Detect-'-  2  a  voltage  in  proportion  to 
a^iX;  -  the  second  component  y2  of  the 

output  vector. 

Much  like  the  Stanford  QMVM,  the  systolic 
optical  processor  described  has  a  dynamic  range 
and  accuracy  determined  by  the  sources,  modula¬ 
tor,  and  detectors.  Output  accuracy  is  limited  to 
eight  to  ten  bits.  A  realistic  processing  capability 
’or  such  a  system  would  be  the  multiplication  of  a 
100- component  vector  by  a  100  x  100  matrix  in 
approximately  10  p.s.  This  is  much  slower  than 
the  Stanford  tiroressor  speed;  however,  unlike  the 
latter,  the  systolic  system  does  not  require  a  2-D 
SLM,  and  the  matrix  can  be  changed  with  each 
operation. 

Shortly  after  the  development  of  the  optica) 
systolic  matrix-vector  multiplier,  two  important 
advances  took  place — the  invention  of  optical 
natr-x  matrix  multipliers  ‘see  box:  "Matrix-Ma¬ 
trix  Multipliers"'  by  Dias;  by  Athale,  Stilweil. 
an c  -  ..-1’  ns.  by  Boeker  Bmmiey.  and  Caulfield; 
and  by  C-vmse-t-  and  the  achievement  by  Guii  ■ 
fovle;  by  .v-ha’.e.  Collins,  and  Stilweil:  and  by 
Cocker,  of  digtta'  accuracy  with  optical  algebraic 
processors. 

Ore  method  for  obtaining  high  digital  accuracv 
a.'  ng  mticai  p-icesso’-s  is  to  implement  digital 
-,i  d-’-'-r  hv  cm '.volution.  This  n  ethod  was 
. a i  *oi-'  ;i;t;'i'.''un  of  the  opt'cal  u- 

-  process:  r.r  -  ,r  orunitv  hv  Spenser  and  White- 
-  ..  ..  o — 2n c  i -4*d  be  Psa’t.s  et  e  .  "be 
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metnoc  :s  exp..»:nec.  w.;r.  the  a. a  of r.g  •.  where 
base-.:  rr.u.t:p..cai.cr.  u:  ,.v.  r. *.r.‘.:jt.  :o,  J9 

ana  15  ,f;  ;,  ;r...  . '■ 

555.  1;  mg  .  •  tr.e  :  .  •>:  nor. tied 

in  nonmc..  .*•  *.  ..*v.-.  ..  nic! 

i  r  p  u  t  b  i  n  ii_r*v  n  c  •: n  ^  r >  v,o  r, :  v* :: ,  «. .  •:  n  t.  3 

COD  till  Ho  a  EfHX  t:C-  ~  \  I*  ZLjT  f  r  ‘‘  fe?  y.cf  H  1,'\T  ,  ;Ci  0.'  HVc’ 
GUTTpUc,  0.^0  Li  .00  -f  ,  3 m  *  -H  ■  ..  C  •  -  ~~y  •_!  T  i:.  un 

binary  In  the  rruxed 'binary  re g re- 5 e ntat: on ,  e.icn 
digit  represent.:,  .he  rr. a.  it _f. .or  o’’  a.  power  0:  2, 
however,  unlike  fun  cma-v.  th  :  value  oi  digit 
it  aot  restricted  to  oe  or  One  nr.ee  r.o  for 
foo\er:;r,i  rvim  gv.x-n.  mum  to  rail  binary  is 
shown  in  Fig.  45;  .  -acn  uig.t  if  me  tnutea-omary 
representation  ’...r.  ;•  .  ■  expressed  in  full  binary 
form,  and  tmese  b mar;  numbers,  appropriately 
shifted,  are  a. idee  using  a  standard  base-2  adder. 
The  real: iter!  •  ;nar-  miner  is  the 

oecirrii.i  proem*  5 bo  expre  -w  :  case 

3 1  nary  niiiLwO cation  .3  «. o ro uo.*. n  possi- 
L'**.:*  r:-xjcau3<f  o":r  i  r*  “<*  m* ?_■<. . c •.  t  i*.  o  v.nj. jTv  i’ti’jrfc* 
^p’". ta c << n  cu \  *  “-'oe  ocr.  ••  oi u- 

non  or  ^r*ai  prr *  L-.r.-j./v  .lyui;  se- 

qu«rnckfb  ’b,:;;.  51  ..  \i.S„  .  L».  Go.'-VCiU- 

non  *"»*  o.:iary  bocutsric  •.*.*.  i.  y  .  oy 

c’o.'Mito* -  .'ptic  .*nr.  •■'./;  v  -  o.'  .  v  *.  s  jin ci  0  6 

ne*'r<i  r-prcis^r/Le;!  "  ac  . .  •  jio-oproc  :*aiis, 

c?‘!:s  can  be  .w-rac^i  3.:  :  ea/v  a^r;uT‘.ioc  errlc:ency 
wuriouh  concern  br  :.o..:-r.ear  rtzv Arm  nur.ner- 

iTiO !'  e  1  *  Vi*  1 H  j "  p  ,  I  ■?<  J  U*  •!  >  '  ’  F  0  n  .  v  V ;  C  Ul.r  .Cl  1 4j  n  ii  v  e 
s mi fi  c i e  n:  p'  "r ;  •. c.  •- a-i •  x  o  i ; ; .  .y u i  -  n  v>« • "  v.  c* e  n 

.t  proa;,  m;::  a-  .*  ..:f  ••  V*;.r  ... 

p u to ,  a i  n  '  r. c  .* :< rr. . : ;  •;  •  ae  r t.*o  r  i.  m u t. 

•  ve  ,.r.  ger.erd  h  h;  r.rs-.-v  ’.-.a;  .V 
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TAi  NlccD  fOR  m:GH  ACCURACY 

Corvpnt-‘?rs  c^.cu.aie  Dy  ;ht* 

o kjcia u i r.o  uOJ. ...  «#r.  r.  a  . ...' f  o : r. . . .  t . .  .j  . .  c  c  ■ 

nor-.,  dr.  s  o::-  -i.a.  h-j-r.^n-  o. 

each  CL.iCU.nj  non  r.ao  i  i;4  .t  ^r*c.*r 

cs. i n cy  c r  ci  i  c r.  -<*-■  r* ^ i r. y  r.  *. r* t*  r.  i>*.i :* .  x- 

. de.r,  anc  ;:a:uf  cd  the  n-v.a.;ea  n.-:-. a:. oris, 
cheat  errors  can  oe  muhip.ito  ,/r  an; 

i  h^b  is  wny  j 2-  anc  *;von  h**—  mt  acraracy 
con* ocita tio Hb  are  somtrCimeb  none  evtr*  wmerc  a 
6-bit  answer  wnl  soiT.ce.  Th.s  .s  also  why  ana¬ 
log  solutions  (electronic  or  opt. call  to  algebraic 
problems  must  oner,  be  avoided. 

In  electronics,  analog  computers  are  used  for 
high-speed,  easily  implemented  operations,  but 
digital  computers  are  used  for  algebra.  Not 
surprisingly,  optical  computation  makes  the 
■  same  division  of  tasks 


levels  be  distinguishable  at  the  output.  Negative 
numbers  can  be  handled  using  2’s  complement 
antnrr.euc  or  other  methods. 

The  accve  method  for  digital  multiplication  by 
convolution  can  be  used  in  a  variety  of  ways  in 
algebraic  optica;  processors  to  obtain  higher  ac¬ 
curacy,  albeit  at  the  cost  of  lower  processing 
rates.  A  digital-accuracy  matrix-vector  processor 
conceived  by  Guilfoyie  achieves  high  processing 
rates  by  using  multitransducer  acousto-optic 
cells  Athaie,  Collins,  and  Stilwell  have  imple¬ 
mented  digital -accuracy  outer-product  matrix - 
matrix  multipliers  using  a  single  pair  of  acousto- 
optic  cells. 

Current  r®s<ecrch  and  new  directions 

efforts  undertaken  curing  the  next  few  years  will 
oe  .n  two  directions.  First,  optical  matrix  comput¬ 
er  systems  rased  on  the  concepts  we  have  been 
describing  wii;  be  built,  tested,  improved,  and 
applied  to  new  areas.  Second,  new  types  of  non¬ 
matrix  ••optical  computers  will  be  developed.  We 
w  .  to-.. op,  on  tx.tn  u;  mese  uirec'ions  briefly 
; '  ■  .  - . c a .  i.i. it,-  .a  ci.upu.ers  tne  two  tr.pusts  are 
,  n . . t- n ,e r. tatiij .0  ana  extension,  i  o  date,  very 
-ittie  irr.piemer.rat.on  has  taken  mace  Doing  this 
*vi  :-'t;  r.re  noth  time  and  money;  it  now  appears 
f.r  -ne>e  w,  re  provided  Practical  issues  -v 
•to rriorien.  ,-.e:oct;  m,  electronics,  and  system 
'<-,.“at,i  r*  must  ar.c  wu!  be  lactu.  However,  me 
•i  -a  nractira*  ipticru  matrix  nrocessors  it.  1 

••  i>  w  Complex  op  .  ations  r.ust  be 

-  -•  ,r. *u  e  p , ,  r  (  n . ,1 ; .  . . . t .  i* , hainiai;  . i . * <? r i : - 
..-  -.  me  ms  ‘or  i  .-ijining  me  statistical!;.-  be- 
e  • ..  or.!  t-i  mv  . :  .:  •  .-  .,.*•  dan  •  f  , 


o  JkrMmettmtiks* 


process  governed  by  a  known  differentia)  equa¬ 
tion  and  measured  in  a  fixed  way  with  known 
measurement  statistics.  Because  a  single  "cycle1’ 
of  a  Kalman  filtering  operation  involves  many 
matrix  calculations,  real-time  Kalman  filtering 
must  be  restricted  to  relatively  small  problems. 
Performing  the  matrix  operations  -  triple  multipli¬ 
cations,  inversions,  etc.)  optically  may  permit  the 
handling  oflaige  problems  in  real  time.  Casasent 
has  started  this  effort,  and  several  others  are 
working  on  it.  Either  floating-point  operations  or 
on-the-fly  seal."  adjustment  is  needed.  Caulfield 
has  shown  that  both  are  possible,  but  his  solu¬ 
tions  are  probably  more  existence  proofs  than 
final  answers.  New  algorithms  are  needed  to 
extend  the  range  of  applications  and,  possibly,  to 
speed  up  calculations.  To  date,  all  important 
algorithms  have  been  iterative.  Noniterative, 
fully  parallel  solution  of  linear  equations  is  possi¬ 
ble  in  analog  optical  processors.  Can  similar 
things  be  done  for  digital  optical  processors? 

Nonmatm  optical  processors  are  developing 
independently  and  rapidly.  Perhaps  the  most 
widely  pursued  of  these  is  the  use  of  optics  to 
make  arbitrary  interconnections  among  electron¬ 
ic  (Goodman)  or  electro-optic  (Lohmann,  Lee, 
Collins,  Goodman,  Sawcnuck,  Strand,  etc.)  sys¬ 
tems.  Sawchuck,  Strand,  and  their  coworkers 
have  implemented  a  variety  of  space-variant  and 
space-invariant  interconnect  patterns  using  com¬ 
puter'  holograms  to  generate  the  patterns  and 
spatial  light  modulators  to  feed  the  information 
back  into  the  svstem.  Their  system  (like  those 
due  to  Lohmann,  Lee,  Collins,  etc.)  closes  on  itself 
for  feedback.  Clearly,  however,  this  is  not  the 
only  configuration.  Feedforward  configurations 
lead  to  a  variety  of  optical  artificial- intelligence 
systems. 

The  continuing  demand  for  higher  throughput 
rates  will  drive  furore  research  toward  higher 
speeds  -and  greater  parallelism  In  these  large 
systems,  or  supercomputers,  of  the  future,  a  ma¬ 
jor  problem  ;n  achieving  high  throughput  rates 
wui  be  how  .0  ’.'.'vc dilate  generalized  communica¬ 
tions  among  the  large  number  of  processing 
umts.  rn  a  genera1 -purpose  commuter,  'he  full 
advantage  of  parnl'evsm  *  id  only  a*  'ea’ized  if 
each  processing  unit  has  direct  communication 
with  e*’e*7  mber  unit,  thus  oermitting  each  to 
handle  a  part  of  the  action  on  a  continuing  basis. 

The  Hehest  level  of  communications,  or  inter¬ 
connect  as  it  is  called,  entails  *  generalized 
crossbar  network  tnvo'vtng  S'1  ; r.-.erconnects 
available  for  ,V  processors  .V  units  -.  cmmunicat- 
ing  wnh  S  cniusi.  ,-.b  shnwr.  iv.  5  Such  a 
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Suppose  we  have  M  input  signalxlhat  w-rrwhb-to 
^connect  to  N  output  -ecetvers.  fach  (nput-ccn  ±»  1 

■connected  to  0,  1,  2.  „  N  outputs. -This  concept  is  best 
Vepresented  as  a  crossoar  arrangement)  fflee  -that  - 
known  here  for  M  =  N  =  4.  tn  fhls-xftogrcrn.We  have  - 
[teed  a  darV  spot  •  to  Indicate  connections mode.  Thus 
input  1  is  connected  to  Outputs  1  and)3^£ tc:ff-we 
regard  the  Input  as  a  vector  .  K  v--..s  - 


fond  the  output  as  a  vector 


When  we  can  write 


Xi 

X  " 

*2 

*3 

i 

L*J 

vector 

- 

~y  * 

/2 

ks 

. 

k4 

■‘ja-cc.T^v “Si-  ’ 


y  =  Ax, 


i'uny  ,-i-  - 
.-X  _ 

- w  “  * 


pwhere  A  Is  a  binary  matrix  In  the  case  just  Mulcted, 

loot" 

0  0  10 
1  0  0  0' 

J3  0  0  1_ 

ffhe  rule  for  obtaining  A  IsTimole.  Porm  a  matrix  8  wtfh 
So  1  where  every  connection  In  the  crossbar  ptot 
pccurs.  In  the- example. 

fi  0  1  0 

jO  0  °  n 
0  1 
-C  0 


r  |  • 


-.iTN, 


twhere  T  indicated  transport: 
{change). 
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MATRIX-MATRIX  MULTIPLIERS.  (a)  RUBIC  cube.  This  is  c  systolic  architecture  whose  major  components  are  a 
oasec  nonconerent  light  source,  a  spatial  light  modulator  tor  each  ci  the  two  input  matrices,  a  2-D  photodector 
array  for  -eading  out  the  output  matnx,  and  a  polarizing  beam  splitter.  The  two  light  modulators  synchronously 
march  the  matrix  information  across  the  optical  aperture,  where  the  proper  terms  superimpose  to  produce  each 
element  ot  the  output  matrix.  .  _ 

(h)  Outer  product  processor  If  one  desires  to  relax  the  dimensionality  requirements  ot  the  Input  devices,  the 
matra -matrix  prooem  may  be  formulated  m  terms  of  outer  products,  rather  than  the  customary  Inner  products. 
For  the  multiplication  of  two  <V  x  N  matrices  A  and  B,  the  ouptut  may  be  expressed  as 

C  *  £  C*  «  ' 

n  -  1 


Ofrbni  '  '  0>*>Pr»V 

Each  matrix  term  mcv  oe  taken  as  the  cuter  proaucf  between  the  nth  column  vector  of  A  ana  the  nth  row  vector 
of  B.  Thu  may  be  done  opttcaity  cs  shown  in  the  figure  employing  two  crossed  arrays.  The  summation  of  fhe  indi- 
vtduaJ  matrices  may  be  realized  via  a  2-0  time  integrating  photodector  array. 

(c)  frequency  muttlptexed.  This  Is  a  systolic  architecture  that  uses  a  linear  LED  array,  an  ocoustoopttc  ceil,  a 
fauner  transform  lens,  and  a  linear  photodetecfor  array.  Input  matrix  B  is  fed  In  the  space-  and  time- multiplexed 
fashion  shown  (rows  of  B  spatially  multiplexed  and  columns  lime  multiplexed),  while  the  matrix  A  is  multiplexed  In 
.  freauency  ana  space,  using  the  ocousto-opttc  ceil.  Each  row  element  of  matrix  A  is  placed  on  a  separate 
frequency  comer,  such  that  after  multiplication  with  the  appropriate  B  elements  via  acousto-optic  modulation. 

:  the  resulting  output  term  is  deflected  by  fhe  transform  lens  to  a  pcrricuicr  pbctodetector  element,  depending  on 
the  comer  freauency.  This  architecture  may  be  viewed  as  a  matrix -vector  multiplier  in  which  frequency 
muftiptexmg  is  used  to  expand  the  vector  to  a  matnx.  - 
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network  becomes  very  expensive  when  imple¬ 
mented  electronically  tor  large  .V,  but  the  inher¬ 
ent  parallelism  of  Doncs  ho'ds  great  potential  for 
inexpensive  and  high-speed  crossbar  switching. 

The  generalized  crossbar  can  be  expressed  ana¬ 
lytically  in  terrns  of  a  vector-matrix  multiplica¬ 
tion,  so  optical  aigebra  forms  the  basis  of  solving 
the  interconnect  problem.  For  example,  consider 
the  Stanford  OMVY  described  previously.  Let  x. 
and  y~  be  the  vectors  of  the  crossbar  inputs  and 
outputs,  respectively,  and  lot  A  represent  the 
interconnect  switch  settings.  That  is,  au  —  1  if, 
and  only  if,  the  zth/ output  is  connected  to  the  y'th 1 
input.  Otherwise,  a,.  =  0  The  OMVM  with  these 
a,/s  automatically  makes  the  desired  connections 
optically.  Note,  too,  that  numerical  accuracy  is 
net  an  issue  for  thus  application. 

The  Stanford  processor  is,  of  course,  nonpro¬ 
grammable,  therefore,  it  can  only  be  used  in  a 
system  with  a  pre-established  set  of  intercon¬ 
nects.  If  one  were  to  replace  the  matrix  filter  with 
a  real-time  device  such  as  a  2-D  spatial  light 
modulator,  then  a  swuchable.  generalized  cross¬ 
bar  becomes  a  possibility;  likewise,  the  binary 
matrix  mask  could  be  replace^  with  a  hologram. 
Going  one  step  further,  one  begins  to  envision 
generalized  crossbars  ’with  picosecond  switching 
speeds  via  real-time  four-wave  mixing  or  an 
optically  addressed  bistable  array.  Such  a  capa¬ 
bility  would  bring  us  into  a  realm  of  computer 
communications  beyond  the  widest  dreams  of 
electronic  interconnection  architects. 

A  more  structured  optical  arrangement  is  the 
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fiberoptic  lattice  filter  'Tur,  Goodman,  etc  i 
When  the  computational  problem  has  sufficient 
symmetry,  a  full  matrix  appr  ach  may  be  an 
inelegant  and  expensive  approach.  The  lattice 
filter  work  represents  an  exploration  of  simpler 
systems  for  simpler  problems.  A  very  common 
problem  ir.  algebra  is  the  evaluation  of  polynomi¬ 
als.  If  an  analog  optical  polynomial  evaluator 
could  be  built,  it  would  be  possible  to  find  the 
roots,  of  polynomials  in  a  totally  new  way:  scan 
the  independent  variable! s;  and  see  where  the 
roots  occur.  This  leads  to  a  solution  of  another 
long-standing  optical  problem  as  well.  The  quo¬ 
tient  La  is  simply  fhe  root  of  the  function 
:'l lx)  -  a,  which  can  Dt-  evaluated  efficiently  in 
polynomial  form.  Work  aloDg  this  line  is  being 
carried  out  (Verber,  Caulfield,  Ludman,  Stilweli, 
etc.}.  Since  holographic  memory  technology  al¬ 
lows  ready  content-addressable  access  to  vast 
amounts  or  data,  a  truth-table  lookup  processor 
appears  both  feasible  and  appealing.  This  ap¬ 
proach  is  now  being  studied  closely  (Gaylord 
etc.). 

Finally,  all  of  these  optical  computers  are  in 
need  of  improved  or  specialized  components.  A 
major  DARPA-sponsored  effort  to  improve  spa¬ 
tial  light  modulators  is  just  beginning.  This 
seems  likely  to  lead  to  improved  throughput  rates 
by  providing  a  2-D  medium  capable  of  1000  x 
1000  individually  addressable  modulator  ele¬ 
ments,  a  cycle  rate  'READ/WRITE  time  cycle)  of 
1  kHz,  a  dynamic  range  of  30  dB,  and  less  than 
39c  spatial  nonumformity.  Other  needs  include 
source  and  detector  arrays  that  are  compatible  in 
resolution,  intensity,  and  dynamic  range  with 
these  spatial  light  modulators  and  that  possess 
individually  addressable  elements. 

Conclusions  and  outlook 

Upon  considenng  the  broad  area  of  optical  alge¬ 
bra.  including  parallel  algorithms,  architectures, 
devices,  and  their  associated  materials,  a  larae 
spectrum  of  interesting  and  important,  research 
areas  comes  to  ":'g'-t."  As  the  national  interest  in 
the  computational  sciences  begins  to  shift  tova-c 
the  supercomputers  envisioned  for  the  ’.990s, 
will  be  vitally  important  for  the  optics  communi¬ 
ty  to  pursue  those  research  areas  for  which  optics 
holds  the  greatest  appeal,  such  as  large-scale 
matrix-matrix  or  matrix-tensor  operations  and 
processor  inter-  and  intracommunications.  We 
must  also  allow  ourselves  to  look  past  the  re¬ 
search  discussed  above  and  into  the  use  of  optics 
to  perform  real-time  circuit  reconfiguration.  For 
■‘xunoie,  light  coulc  be  ’used  to  modify  the  index 
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of  refraction  within  waveguides  in  such  a  manner 
as  to  change  channel  layouts  and  beam-control 
elements  on  a  circuit  module,  thereby  adding 
much-needed  flexibility  to  optical  computing. 
These  new  directions  are  mentioned  to  convey  to 
the  reader  something  of  the  excitement  of  a  field 
that  is  not  only  maturing,  but  also  expanding. 
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EFFICIENT  REAL  NUMBER  REPRESENTATION  WITH  ARBITRARY  RADIX 


H.J.  Caulfield,  D.S.  Dvore,  and  J.H.  Gruninger 
Aerodyne  Research,  Inc. 

45  Manning  Drive 
Billerica,  MA  01821 

ABSTRACT 

Because  most  optical  digital  computers  use  only  nonnegative  quantities, 
it  Is  of  great  interest  to  find  an  efficient  way  to  represent  real  numbers. 

For  radix  2  (binary)  numbers  the  twos  complement  method  requires  only  one 
extra  digit  beyond  that  needed  for  non  negative  numbers.  We  introduce  here  an 
arbitrary  radix  generalization. 

BACKGROUND 

Optical  computers  (1-10)  have  become  extremely  popular  because  of  their 
speed,  low  power  consumption,  and  relatively  low  volume  and  weight.  Digital 
number  representation  is  as  necessary  for  accuracy  in  optics  as  it  is  in 
electronics.  In  ootical  digital  computers  the  optimum  radix  choice  is  by  no 
raean3  clear  and  may  even  be  computer  or  problem  dependent.  For  radix  2 
(binary)  representation,  the  twos  complement  method  (11)  is  an  optimally- 
efficient  way  to  represent  real  numbers  in  that  an  N-bit  real  number  can  be 
represented  with  only  N  +  1  digits.  Obviously  no  more  efficient  representa¬ 
tion  can  be  devised.  We  have  been  unable  to  locate  in  the  literature  a  scheme 
for  representing  N  digit  radix  R  (  >  2)  numbers  with  only  N  +  1  digits.  This 
work  represents  our  attempt  at  the  needed  generalization. 
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EXPOSITION  APPROACH 


Our  exposition  will  proceed  in  two  stages  aimed  at  making  the  method 
understandable.  We  avoid  theorum  and  lemma  proving  in  favor  of  simplicity  and 
clarity.  The  method  works  only  with  even  radix.  First,  we  will  illustrate 
this  method  with  examples  from  the  familiar  radix  10  numbers.  Second,  we  will 
offer  an  explanation  which  is  radix  independent. 

NOTATION  (Radix  10') 

We  suppose  that  the  numbers  of  interest  are  of  absolute  value  less  than 
10^,  where  N  is  a  preselected  integer  such  as  4.  For  N  <■  4,  the  numbers  lie 
between  -9999  and  9999.  Thus  only  N  digits  are  needed  to  represent  the 
absolute  value.  To  this  we  add  a  single  sign  digit.  The  sign  digit  for  a 
positive  number  will  be  0,  2,  4,  6,  or  8.  The  sign  digit  for  a  negative 
number  will  be  1,  3,  5,  7,  or  9.  For  negative  numbers  we  complement  the 
absolute  value,  i.e.,  subtract  it  from  10^.  For  convenience  of  notation,  we 
give  this  new  method  the  name  "parity  sign”  and  the  normal  representation 
"arithmetic".  Table  1  shows  some  sample  arithmetic  and  parity  sign 
representations  of  the  same  number. 

ADDITION  EXAMPLES 

Let  us  add  +0012  to  +0008.  We  know  that  the  answer  is  +0020.  In  parity 
sign  we  might  have 


20012 

+  80008  (1) 
100020 
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The  last  five  digits  are  00020  which  is  one  of  the  parity  sign 
representations  of  +0020.  Now  let  us  add  +0008  to  -0012.  We  might  write 


40008 

+  99988  (2) 

139$96 


The  last  five  digits  are  39996  which  is  one  of  the  parity  sign  representations 
of  -0004. 

MULTIPLICATION  EXAMPLES 

Let  us  multiply  +0012  by  +0004.  We  might  write 


00012 
x  40004 
00048 
00000 
00000 
00000 
00048 
0004&0048 


(3) 


The  last  five  digits  are  30048  which  i3  one  of  the  parity  sign  representations 
of  +0048. 

Sow  let  us  multiply  -0012  by  +0004.  We  might  write 


99988 

x  00004  (4) 

3  99  ^  5  2 


The  last  five  digits  are  99952  which  is  one  of  the  parity  sign  representations 


of  -0043. 


EXPLANATION 


We  are  used  to  graphing  the  arithmetic  representation  of  a  number  versus 
itself  (i.e.  plotting  f(x)  *  x  in  arithmetic  notation).  Figure  1  shows  such  a 
plot  for  the  domain  JxJ  <  105 .  If  we  restrict  jxj  to  that  domain,  we  can  plot 
a  multivalued  representation  m(x)  vs.  x  as  shown  in  Figure  2.  If  we  now 
restrict  ourselves  to  m(x)>  0,  we  can  still  represent  any  number  in  |x|  _< 
i05 .  the  negative  x's  will  have  an  odd  fifth  digit.  Even  numbers  will  have 
an  even  fifth  digit.  Furthermore 

m(x  +  y)  -  m(x)  +  m(y),  (5) 

where  we  mean  by  m(x)  all  of  the  values  of  m(x)  ,  etc.  Likewise 

m(xy)  -  m(x)m(y)  (6) 

OTHER  EXAMPLES 

For  the  special  case  of  radix  2  we  obtain  a  signed  twos  complement.  Thus 
+0011  plus  -1010  (+3  -10  j.n  radix  10)  is  in  parity  sign 

00011 

+  10101  (7) 

11000 

which  is  the  parity  sign  representation  of  -0111  (-7  in  radix  10).  Thus  in 
the  binary  case  the  parity  sign  digit  is  no  longer  multiple. 
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CONCLUSION 

The  parity  sign  representation  is  easy  to  use  and  easy  to  understand.  It 
includes  the  traditional  binary  signed  twos  complement  method  as  a  special 
case  while  extending  the  one-digit-sign- indication. efficiency  advantage  to 
arbitrary  radix.  Finally,  one  must  be  careful  to  prevent  "overflow”  - 
attempted  calculation  of  numbers  greater  than  the  maximum  the  system  is 
designed  to  handle.  When  overflow  occurs,  the  numerical  part  of  the  result 
(in  our  example,  th  last  four  digits)  is  correct  but  the  amount  of  overflow 
is  undetermined. 

The  simplest  way  to  prevent  overflow  is  to  test  input  numbers.  We 
suggest  the  following,  quite-conservative  test  for  radix  r  amplitudes  which 
must  be  le3S  than  r^N.  Ve  write 

r  -  2s 

since  r  is  even.  We  ignore  the  sign  digit  and  require 

(1)  For  multiplication  both  numbers  be  less  than  r^,  so  the  first  s 
most  significant  digits  must  be  zero  and 

(2)  For  addition  both  numbers  be  less  than  s  •  r^N-l  and  therefore  the 
most  significant  dig*”  must  be  s-l  or  less. 
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FIGURE  CAPTIONS 

Figure  1:  A  representation  R(n)  of  numbers  n  satisfying  -9999  $  n  £  9999. 

Figure  2:  A  multivalued  representation  m(n).  All  values  of  m(n)  for  a 

given  n  are  equally  valid. 
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Floating  point  optical  matrix  calculations 
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Abstract.  The  recent  explosion  of  interest  and  activity  in  optical  numerical 
processing  has  occurred  despite  the  fact  that  calculations  had  to  be  carried  out 
with  integer  or  fixed  point  arithmetic.  We  show  here  that  floating  point  optical 
matrix-vector  multiplication  is  feasible. 

Keywords:  optical  computing:  matrix  calculations:  algebra :  floatingpoint  systems 
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1.  BACKGROUND  ON  FLOATING  POINT  ALGEBRA 

A  wide  variety  of  new  architectures  and  algonthms  for  optical  matrix 
operations  have  been  introduced  recently.1-6  Without  exception 
these  have  used  fixed  point  arithmetics.  The  sustained  interest  in 
these  systems  arises  from  the  high  capacity,  high  speed,  and  low 
power  consumption  of  these  optical  computers  and  from  the  fact  that 
their  fixed  point  calculations  can  be  very  accurate.7-9  Of  course,  the 
range  of  applications  could  be  expanded  tremendously  if  floating 
point  calculations  could  be  performed. 

In  floating  point  notation  (base  b)  every  number  is  written 

n  =  O.i |  i-.  ij  •  •  ■  iM  X  be  ,  (I) 

where  i ,.  i:, .  .  .  iM  are  integers  between  0  and  b  —  I,  M  is  a  preset 
integer,  i,  ■*  0,  and  e  is  an  integer.  We  call 


we  have 

n,n2  =  m|tn2be|  +  ej  .  (M 

Neither  the  mantissa  multiplication  (m,m2)  nor  the  exponent  addi¬ 
tion  (e,  +  e2)  is  difficult  to  achieve  optically.  What  is  necessary  but 
far  more  difficult  optically  is  adding  n ,  and  n2.  Obviously, 

n,  +  n2  =  m,be'  +  m2bCj  =  m3be5  .  (6) 

In  a  computer  one  finds  the  larger  of  e|  and  c2.  Without  loss  of 
generality,  we  assume  e,  >  e2. 

Clearly, 

n,  +  n2  =  m,bC|  +  m2bCjbe'b~e'  =  (m,  +  m2be2_e2)be'  .  (7) 
Therefore, 

e3  =  e,  ,  (8) 

and  m3  is  calculated  by  rounding  off  m,  +  m2  bej  e'  after  M 
places.  We  see  no  obvious  way  to  do  those  steps  optically,  so  we  have 
adopted  a  new  but  largely  equivalent  approach. 


m  =  O.i,  u 


the  mantissa  and  e  the  exponent.  With  two  numbers  of  the  form 
n,  =  m,be'  ( 


n-  =  m-be-’ 
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2.  BACKGROUND  ON  OPTICAL  VECTOR-MATRIX 
MULTIPLIERS 

The  prototypical  modern  vector-matnx  multiplier  is  that  of  Good¬ 
man  et  al.1  More  recently  systolic  and  engagement  versions  havo 
been  introduced  to  simplify  hardware  and  speed  up  the  operations  •'  1 
All  of  these  start  with  a  linear  array  of  N  discrete  incoherent  light 
sources  representing  the  input  vector  components  and  produce  a 
linear  array  of  N  discrete  points  of  light  (each  of  which  is  detected  on 
a  discrete  detector)  to  give  the  N  components  of  the  product  vector 
The  Goodman  processor  calculates  the  full  matrix  “instat.  iy,"  whiv 
the  systolic  approaches  require  time  integration  over  N  pulses  io 
arrive  at  the  final  answer.  The  floating  point  need  is  present  in  boih 
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3.  DI  AL  REPRESENTATION  APPROACH 

The  approach  proposed  here  is  limned  to  the  systolic  processors.  The 
key  idea  is  to  use  different  means  for  representing  input  and  output 
numbers  optically  The  input  vector  components  are  represented  by 
encoding  both  their  mantissas  and  exponents  as  source  brightnesses 
in  separate  and  parallel  processors.  One  processor  does  nothing  but 
multiply  mantissas  A  similar  but  separate  processor  adds  the  expo¬ 
nents.  We  assume 

-em<e,.e;<em  (9) 

and  encode  e  as 

f  =  e^em.  *  (10) 

clearly. 

0<f<2em  (II) 

The  signal 

f,  =  f,  *f:  =  2em  +(f,  ->-f,)  (12) 

is  used  todrive  an  optical  light  deflector  to  one  of  (2em  •+■  I )  possible 
positions  (one  for  each  possible  value  of  f ,  +  f;)  spatially  normal  to 
the  line  of  output  vector  component  points.  A  two-dimensional 
detector  array  [N  vector  components  by  (2em  +  I)  exponents] 
receives  the  deflected  light  and  integrates  over  the  required  N  pulses. 
Then  for  each  vector  component  we  call  the  highest  nonzero  value  of 
an  exponent  e0  We  then  take  the  time-integrated  mantissa  products 
on  that  detector,  add  to  that  I  /  b  of  the  products  for  the  next  lowest  e 
value,  etc.,  until  we  have  added  all  of  the  products.  That  weighted 
sum  we  call  s0.  We  then  write 

n,  +  nj  =  S0b'»  .  (13) 


While  this  looks  in  form  like  Eq.  ( I ).  the  condition  in  Eq  ( I )  that 

1,  #  0  may  not  hold.  Thus,  we  may  not  have e0  =  e,  Nevertheless, 
this  is  a  floating  point  operation  with  all  of  the  accuracy  advantages 
thereof 

4.  UNRESOLVED  PROBLEMS 

Two  major  problems  with  this  technique  remain  unsolved  First, 
simple  electro-optical  light  deflectors  are  very  fast  but  do  not  give 
many  resolvable  spots  (limiting  em).  while  mechanical  or  acousto¬ 
optic  light  deflectors  give  many  resolvable  spots  but  may  slow  up  the 
system  too  much.  Thus,  the  choice  of  deflector  is  critical  and  diffi¬ 
cult.  Second,  because  one  spatial  dimension  is  used  for  the  exponent, 
it  is  by  no  means  clear  if  this  technique  is  extendable  to  the  modern 
optical  matrix-matnx  multipliers4  5  which  already  require  a  two- 
dimensional  detector  array. 

5.  PERSONAL  CONCLUSION 

It  has  been  my  experience  that  an  "existence  proof  (such  as  I  have 
offered  here  for  optical  floating  point  algebra)  invariably  produces 
almost  immediate  improvements  by  others.  I  trust  and  hope  this  will 
happen  here 
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APPENDIX  K 


FLOATING  POINT  OPTICAL  COMPUTATION 
FOR  ALL  MATRIX  OPERATIONS 


Spatial  encoding  for  optical  floating  point  computation 


H.  John  Caulfiekj 


Following  the  lead  of  electronic  computers,  optical  computers  must  adopt  floating  point  calculation  to  allow 
for  both  high  accuracy  and  high  dynamic  range.  Given  here  is  a  method  for  using  spatial  encoding  for  that 
purpose. 


I.  Introduction 

High  numerical  accuracy  is  required  for  most  alge¬ 
braic  calculations.  Long  ago  this  forced  electronic 
computer  designers  to  adopt  digital  rather  than  analog 
number  representations  and  to  introduce  floating  point 
calculations.  Recently  optical  computer  designers  have 
devised  a  number  of  ways  of  using  digital  number  rep¬ 
resentations. 1-4  Thus  the  remaining  step  is  floating 
point  calculation.  Although  a  preliminary  step  toward 
floating  point  optical  computing  has  been  taken,5  no 
universally  applicable  method  is  known. 

II.  Basic  Concepts 

The  basic  idea  of  floating  point  operation  is  to  rep¬ 
resent  a  positive  number  by 

n  *  m  x  b*.  (1) 

where  m  -  a  positive  number  within  a  well-defined 
range. 

b  =  a  fixed  positive  integer  called  the  base  or 
radix,  and 

e  =  a  real  (positive  or  negative)  integer  called 
the  exponent. 

( Negative  and  even  complex  numbers  are  easily  repre¬ 
sented  also,  but  this  would  be  an  unnecessary  digression 
here. )  In  an  electronic  digital  computer  one  normally 
keeps  b  >  m  >  1.  For  optical  computing  we  may  relax 
that  requirement  slightly. 

Now  consider  two  numbers: 

rii  ■  m,  x  o»',  (2) 

m  mj  x  6*’  (3) 
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The  two  operations  optical  computers  perform  are  ad¬ 
dition  and  multiplication.  Multiplication  is  the  easier 
task.  We  have 

n,  X  n2  m  (iri!  X  m2)  X  (4) 

We  already  know  how  to  multiply  mi  by  m2.  We  need 
to  add  e  1  and  e2  at  the  same  time.  Adding  two  integers 
of  moderate  size  can  be  done  either  electronically  or 
optically.  The  only  problem  appears  to  be  that  of 
bringing  mi  X  m2  back  within  the  desired  range  before 
subsequent  calculations.  This,  it  appears,  will  be  a 
recurrent  problem  in  optical  floating  point  operations. 
If 

b  >  1,  (5) 

then 

i1  >  ni|  X  nij  >  1.  (6) 

If  we  have  time  to  test  mi  X  m2  we  can  either  use  it  (if 
mi  X  m2  <  b)  or  divide  it  by  b  (if  b2  >  mi  X  m2  >  b)  and 
replace  e\  +  e2  by'ej  +  e2  +  Ci- 
Adding  to  rt2  is  more  difficult.  In  an  electronic 
computer  we  calculate 

ni  +  n2  ■  m\br'  +  m^b'*  (7) 

If  we  determine  e\  >.  ez,  then 

di  +  •  Im,  +  m2b,,~,')b,y  (8) 

Since 

&•*-*>  <  1  (9) 

(for  ei  >  e%),  this  means  attenuating  m2  before  adding 
it  to  mi.  Finally,  in  an  electronic  computer,  we  round 
off  mi  -l-  m2b,s-ei  to  the  desired  number  of  bits.  Thus 
there  is  a  nonlinear  decision  step  which  is  difficult  to 
implement  optically. 

These  difficulties  are  compounded  by  the  fact  that 
all  optical  matrix  algebra  computers  involve  accumu¬ 
lating  products,  e.g., 

n  I'll  x  nil  +  mj  x  n,)  +  x  •  10' 
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That  is,  multipie  products  must  be  added.  What  fol¬ 
lows  is  a  solution  (indeed  several  solutions)  to  this 
problem. 

III.  Vector-Matrix  Multipliers 
Here  we  use  optics  to  calculate 

•4x  »  y  till 

or 


J,1!  a,,xr  U2) 

/•i 

We  assume  that  all  a„  and  Xj  values  a re  furnished  in 
floating  point  form. 

We  must  begin  with  a  naive  and  totally  fallacious 
solution.  We  could  introduce  an  attenuator  with 
transmission  6ei+*2  before  the  detector  of  m^mi.  This 
places  the  whole  burden  on  the  dynamic  range  and  re¬ 
peatability  of  the  multiplier.  That  is,  it  offers  no  im¬ 
provement  over  fixed  point  operation. 

We  conclude  that  we  need  a  separate  detector  for 
each  Ci  +  «2  value.  Then,  in  final  readout,  the  accu¬ 
mulated  values  in  each  e1  +  ei  bin  are  first  thresholded 
to  eliminate  pure  noise  and  then  added  with  appropriate 
weights  to  give  the  final  result. 

Let  us  illustrate  with  the  following  b  =  10  example: 

fl  2 

A  - 

12  10 

x-  2  •  (13) 

110 


fl  X  2  +  2  X  10 

[2  x  2  +  10  x  10 


We  write 


Then 


fl  x  10° 

A  ■ 

[2  x  10*- 


x  » 


(2  x  io® 
[l  X  10* 


2  x  10® 

1  x  to1 


(14) 

(15) 


[1  x  2  X  10®*®  +  2  x  l  x  10°*'  fy, 
y  *  ■  (16) 

[2  X  2  X  10°*®  1  X  1  X  101*1]  [yy 

We  suppose  that  there  are  at  least  three  detectors  for 
each  v,  corresponding  to  10°.  101.  and  102  products.  We 
then  operate  electronically  according  to  the  decision  tree 
of  Fig.  1.  Clearly  we  can  achieve  as  many  exponent 
sums  as  we  have  detectors. 

In  optical  vector-matrix  multipliers  each  yi  is  de¬ 
tected  by  a  single  detector.  To  allow  floating  point 
calculation,  we  need  to  replace  each  single  detector  with 
multiple  detectors — one  for  each  possible  exponent 
sum.  One  way  to  do  this  is  to  use  a  light  deflector  for 
each  y.  driven  to  deflect  the  mantissa  product  onto  the 
appropriate  detector.  This  is  the  method  of  Ref.  5. 

In  integrated  optical  matrix-vector  multipliers  the 
deflectors  might  be  buiit  n  since  the  common  base 
material  (lithium  niohote)  makes  a  good  acoustooptic 
deflector.’’ 

In  bulk  optics,  convenience  demands  many  deflectors 
on  a  single  substrate.  This  is  now  practical.  Another 


Fig.  1.  Logic  for  assigning  mantissa  and  exponent  to  the  number 
accumulated  in  three  bins  (detectors). 


way  of  doing  this  is  encoding  the  exponent  sum  (com¬ 
puted  electrically)  as  a  frequency  of  modulation  on  the 
input  light  and  frequency  analyzing  the  output  with  an 
acoustooptic  frequency  analyzer. 

Unfortunately  these  methods  fail  for  matrix-matrix 
multipliers  which  already  use  2-D  detector  arrays. 

IV.  Matrix-Matrix  Multipliers 

As  just  indicated,  deflectors  seem  impractical  for 
matrix-matrix  multipliers,  so  alternatives  must  be 
considered. 

One  alternative  is  to  use  frequency  encoding  of  ex¬ 
ponent  sums  as  suggested  before  but  analyze  the  bins 
electrically.  This  is  no  real  solution  in  the  sense  that 
it  still  uses  only  one  detector  for  all  the  frequency  bins 
to  be  searched.  Indeed  we  must  accept  multiple  de¬ 
tectors  as  a  fundamental  price  to  be  paid  for  floating 
point  operation. 

Spatial  encoding  exponents  seem  to  be  a  rational 
approach  to  the  problem.  We  show  below  how  we 
might  encode  the  matrix-vector  problem  used  as  an 
earlier  example: 


K-’f 
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“1  1  1  2  2  2“ 

0  0  0  0  0  0 

1  0  0  0  0  0  0 

~  2  2  2  0  0  0 

0  0  0  1  1  1 

_0  00000- 


Exponent 

0 

1 

2 

0  ’ 
l 


(17) 


X 


p:  o  r 

j  2  0  0 

2  0  0 

0  1  0 

0  1  0 
_0  1  0_ 
0  1  2 


Exponent 


(18) 


Note  that  each  number  is  represented  by  a  3  x  3  array  of  numbers. 
In  A  the  number  is  repeated  three  times  horizontally  In  X  the 
number  is  repeated  three  times  vertically. 

Multiplying,  we  have 

dllX  i  +  OljX  2 

y  * 

P21*  1  4"  0^2X2 


h 

1  1' 

/  2 

0 

o' 

0 

0  0 

1  2 

0 

0 

1° 

0  Oj 

l2 

0 

0/ 

h 

2 

2' 

f  0 

1  ol 

0 

0 

0 

0 

1  0 

1° 

0 

°) 

1° 

1  oj 

In 

* 

2 

A 

2 

0 

°1 

1 

+  - 

1 

0 

0 

0 

2 

0 

0 

3 

3 

0 

°J 

l2 

0 

°J 

^2  0  o' 

’o  2  o' 

0  0  0 

+ 

0  0  0 

0 

0 

0 

O 

O 

U  0  o\ 


0 

L\° 


1  oi 

0  01 


'2  X  10°  +  2  X  10*j 

A  x  10°  +  1  x  102] 


22' 

104 


(20) 


as  required.  Clearly  this  extends  readily  to  the  ma¬ 
trix-matrix  case. 


V.  Conclusions 

Optical  floating  point  calculations  are  readily 
achievable  by  spatial  encoding.  Like  ail  the  other  im¬ 
provements  in  number  representation  for  optical 
computing  (capability  of  representing  real  numbers, 
complex  numbers,  and  digital  numbers),  the  price  that 
is  paid  is  a  loss  in  the  throughput  rate  at  which  numbers 
are  processed.  As  high  throughput  is  one  of  the  sup¬ 
posed  advantages  of  optical  computing,  designers  must 
exercise  care  in  system  design.  Finally,  we  should  note 
that  on-the-fly  scale  adjustment  can  achieve  many  of 
the  effects  of  floating  point  operation  with  no  penalty 
in  throughput  but  some  penalty  in  complication.7  Thus 
multiple  solutions  to  the  dynamic  range  problem  are 
now  available. 


4  0 

0  0 
0  0 


o\  00 
0  +  0  i 
0)  \0  0 


0  0 

\o  0 
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(19) 


/  4  0  o\ 

0  10 

\o  0  0 

Each  position  in  the  output  3X3  detector  array  corre¬ 
sponds  to  a  unique  exponent  sum.  Assigning  exponent 
values  as  shown  beiow 

0*0-0.  0*1-1.  0  +  2-2, 

1+0=1.  1  +  1-2.  1+2-3, 

2  +  0  =  2.  2+1=3.  2  +  2-4. 

in  the  final  result  leads  to 
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Rapid  unbiased  bipolar  incoherent  calculator  cube 


R.  P  Booker,  H.  J.  Caulfield,  and  K.  Bromley 


Presented  in  this  paper  is  one  of  several  possible  electrooptical  engagement  array  architectures  for  perform 
ing  matrix-matrix  multiplication  using  incoherent  light.  Essential  components  of  this  new  signal-process¬ 


ing  device  include  two  dynamic  light 
a  single  polarizing  oeam  splitter. 


I.  Introduction 

In  this  paper  we  present  a  new  concept  for  performing 
the  mathematical  operation  of  matrix-matrix  multi¬ 
plication  using  electrooptical  technology’.  This  concept 
is  based  on  the  pioneering  work  of  Kung1  for  performing 
matrix-matrix  multiplication  using  an  all-electronic 
'Vstoiic -arrav  architecture.  A  novel  feature  of  the 
riectroopticai  approach  is  that  it  is  not  limited  to  2-D 
architectures  as  is  the  case  when  employing  silicon 
technology  ,r.  an  electronic  implementation.  Before 
describing  the  electrooptical  approach,  let’s  briefly  re¬ 
view  prior  work  for  performing  both  matrix-vector  and 
matrix  matrix  multiplication  using  optical  tech¬ 
nique?, 

II.  Background 

1  he  u.-e  i[  optical  correlation  techniques  involving 
coherent  light  for  performing  matrix -matrix  and  ma¬ 
trix  ector  multiplication  has  been  extensively  studied 
maim-matu  uiy-  and  experimentally  demonstrated  for 
matrices  -  it  -  irrier  2.  ’  This  technique  has  the  undesir- 
aoie  feature  that,  as  the  matrix  order  increases,  the 
manner  of  unwanted  circular  distributions  of  light  ap¬ 
pearing  :n  ’he  output  plane  of  the  processor  rapidly 
escalate-  thus  reducing  the  light  available  at  those 
positions  corresponding  to  product  matrix  element  in¬ 
formation  In  addition  to  this  technique,  there  have 
been  a  mini  her  of  other  techniques  investigated  using 
aici.nerent  light  tor  performing  matrix-vector  muiti- 
pnc.i; ; .  it i  r'i >r  example,  preliminary  studies  in  this  area 
describe  tne  computation  of  I  -I)  discrete  Fourier 
'. r  1 1 1 - 1 • , r ms . *  sine,  cosine,  and  Waish-Hadamard 
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valves  operating  in  a  reflection  mode,  a  2-D  photodetector  array,  and 


transforms  as  well  as  a  variety  of  linear  filtering  opera¬ 
tions. The  technical  feasibility  of  this  particular  ap¬ 
proach  was  demonstrated  for  matrices  of  order  32  using 
an  optical  device  earlier  developed6  for  performing 
correlation  and  convolution  operations  with  incoherent 
light.  In  the  original  version  of  this  optical  correlator, 
a  single  light-emitting  diode,  photographic  film  trans¬ 
parency.  mechanical  scanning  mirror,  and  a  vidicon 
detector  were  employed.  More  recently,7  8  the  scanning 
mirror  and  vidicon  detector  were  replaced  by  a  solid- 
state  area-array  charge-coupled  device,  thus  greatly 
reducing  the  size  of  the  processor.  Matrix-vector 
multiply  operations  involving  matrices  of  order  128  are 
presently  performed  using  this  approach. 

A  second  technique  for  computing  matrix-vector 
products  using  incoherent  light  involves  the  use  of  a 
linear  array  of  light-emitting  diodes,  an  optical  trans¬ 
parency.  and  a  linear  array  of  photodetectors. 9  This 
architecture  has  the  advantage  that  the  data  vector 
information  may  be  entered  in  parallel,  thus  allowing 
for  higher  throughput  rates.  The  feasibility  of  this 
approach  has  been  demonstrated  for  matrices  of  order 
10.  Combining  this  architecture  with  a  1  -D  adder  in  a 
feedback  loop  gives  rise  to  an  iterative  electrooptical 
processor.1”  With  this  capability,  it  is  possible  to  per¬ 
form  other  higher-level  matrix  operations  such  as  the 
solution  of  simultaneous  algebraic  equations,  least- 
squares  approximate  solution  of  linear  systems,  matrix 
inversion,  and  eigensystem  determination,1 11- just  to 
mention  a  few. 

Most  recently,  much  attention  has  been  focused  on 
implementing  parallel-processing  architectures  for 
performing  a  variety  of  matrix  operations  using  exclu¬ 
sively  electronic  components.  Most  noteworthy  is  the 
work  of  Kung  on  systolic-array  architectures.1  1!;  1 
Combining  VLSIA'HSIC  technology  with  systolic-arrav 
processing  techniques  should  give  rise  to  increased 
signal-processing  capabilities  by  at  least  a  factor  ot 
100. 1  Already  a  2-1)  systolic-array  testbed  has  been 
designed  and  fabricated  for  validating  many  of  the 
proposed  architecture.,  and  algorithms  envisioned. ln  A 
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similar  all-electronics  parallel  approach  has  been  pro¬ 
posed1  using  an  engagement-array  architecture.  As 
it  turns  out,  these  new  systolic  engagement  types  of 
architecture  are  not  restricted  to  solely  electronic  im¬ 
plementations.  For  example,  an  acoustooptic  approach 
using  incoherent  light  for  performing  matrix-vector 
multiplication  employing  the  systolic/engagement-array 
architecture  has  recently  been  described. ,s  1’his  ac¬ 
oustooptic  processor  uses  a  linear  array  of  light-emitting 
diodes  for  inputting  the  matrix  information,  an  acous¬ 
tooptic  traveling-wave  modulator  for  inputting  the 
vector  information,  and  a  linear-array  charge-coupled 
device  for  computing  the  desired  output  vector  infor¬ 
mation.  This  approach. has  the  advantage  that  the 
input  vector  and  matrix  information  may  be  entered  in 
real  time. 

III.  Preliminaries 

To  illustrate  the  concept  of  matrix-matrix  multipli¬ 
cation  using  an  optical  engagement -array  architecture, 
consider  the  case  when  the  matrices  involved  have 
real-positive  elements  only  and  are  of  order  3.  That 
is, 


an  a,;  a-.a 

t>l  1  fc>!2  bil 

Cu  Cl  2  Ci3 

321  322  a;: 

bci  b22  b'v. 

= 

C21  022  C23 

a3l  a32  3.33 

b-ji  bj2  bsa 

C.11  C32  C33 

or,  equivalently, 

AS  *  C,  (2) 

where  A  and  B  are  known  input  matrices,  and  C  is  the 
desired  output  matrix.  Each  element  of  matrix  C  is 
obtained  by  the  equation 

c,»  =  t.  a,,b,*  i.k  =  1,2,3.  (3) 

j-i 

The  techniques  presented  here  certainly  apply  to  ma¬ 
trices  of  order  >3.  Order  3  matrices  were  chosen  merely 
to  illustrate  easily  the  concepts  involved.  Shown  in  Fig. 
1  is  a  2-D  array  of  photodetectors  initially  containing 
zero  charge  at  each  detector  site,  two  optica!  transpar¬ 
encies  encoded  with  the  matr,x  A  and  B  information, 
with  each  transparency  capable  of  translating  in  front 
of  the  photodetector  array  as  shown,  and  an  incoherent 
light  source  providing  a  spatially  uniform  collimated 
light  beam  comprised  of  a  time  sequence  of  equal  in¬ 
tensity  pulses.  Light  propagation  is  from  left  to  right. 
As  seen  in  this  figure,  each  optical  transparency  is 
partitioned  into  an  irray  of  rectangular-shaped  reso¬ 
lution  cells,  some  containing  the  matrix  A  and  B  infor¬ 
mation,  the  remaining  being  optically  opaque.  Those 
ceils  containing  matrix  information  each  have  an  in¬ 
tensity  transmittance  proportional  to  the  magnitude  of 
the  corresponding  matrix  element  located  at  that  cell 
as  depicted  in  Fig.  1.  At  any  one  instant  in  time,  only 
a  3  X  :t  array  of  resolution  cells  in  each  transparency  is 
illuminated  by  a  single  Lght.  ou'se  of  short  time  dura¬ 
tion.  The  resulting  spatially  modulated  light  beam 
impinges  or  the  prv -tod elector  array,  whence  photo¬ 
electric  charge  is  generated  and  accumulated. 
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Fig.  1.  Optical  engagement  matrix  matrix  multiplier  using  sliding 
optical  transparencies.  I  Initial  State  i 


Fig.  2.  Optical  engagement  matrix-matrix  multiplier  using  sliding 
optical  transparencies.  (Final  State.) 


Initially  the  optical  transparencies  are  so  positioned 
that  the  first  light  pulse  passing  through  the  system 
passes  through  those  3X3  arrays  containing  only  the 
an  and  bn  element  information,  respectively.  The 
result  is  that  only  the  photodetector  in  the  upper-left 
corner  of  the  detector  array  receives  light.  The  amount 
of  photoelectric  charge  generated  at  that  particular 
detector  is  proportional  to  the  product  of  an  and  bn- 
Next,  optical  transparency  A  is  shifted  horizontally  to 
the  right  one  resolution  cell  width,  and  transparency  B 
is  shifted  vertically  downward  one  resolution  cell  height. 
At  this  point,  the  light  source  generates  a  second  pulse 
of  light  identical  to  the  first.  Now  the  upper-left  three 
photodetectors  in  the  array  each  generate  quantities  of 
photoelectric  charge  proportional  to  the  product  of  the 
transmittances  of  those  resolution  cells  directly  in  front 
of  each  detector.  This  process  continues  in  this  manner 
until  the  optical  transparencies  have  physically  trans¬ 
lated  past  the  detector  array  as  shown  in  Fig.  2.  On 
closer  examination,  we  find  that  at  each  photodetector 
element  site  there  is  a  quantity  of  photoelectric  charge 
which  has  accumulated  that  is  proportional  to  each 
matrix  element  comprising  the  desired  matrix  C.  This 
then  represents  a  simple  version  of  the  engagement- 
array  architecture  for  performing  matrix-matrix  mul 
tiplication  using  two  optical  transparencies  which 
physically  translate  across  the  face  of  a  fixed  photode- 
t ec tor  arrav. 
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IV.  Proposed  Electrooptical  Configuration 

The  architecture  just  described  lor  performing  ma¬ 
trix-matrix  multiplication  using  an  optical  engage¬ 
ment-array  approach  was  primarily  examined  for  the 
purpose  of  illustrating  the  basic  concepts  involved. 
Unfortunately,  this  architecture  lacks  the  capability  of 
updating  or  changing  the  input  mat  c-»s  A  and  B  in  a 
real-time  manner.  This  is  principally  because  most 
optical  transparencies  are  made  of  photographic  film. 
Of  course,  one  way  around  this  difficulty  is  through  the 
use  of  light  valves  whose  optical  properties  can  be 
changed  in  real  time  by  electronic  means.  That  is,  if  we 
simply  replace  the  translating  optical  transparencies  by 
stationary  li^ht  valves  whose  transmission  character¬ 
istics  can  be  changed  and  updated,  matrix-matrix 
multiplication  can  be  performed  without  the  need  for 
translating  components. 

A  compact  architecture  based  on  these  ideas  is  illus¬ 
trated  in  Fig.  3.  The  basic  components  required  for  this 
system  concept  include  a  polarized  incoherent  colli¬ 
mated  light  source  with  the  same  properties  as  before, 
a  polarizing  beam  splitter,  two  light  valves  operating  in 
a  reflection  mode,  and  a  2-D  array  of  photodetectors 
also  with  the  same  properties  as  before.  Collimating 
and  imaging  optics  may  be  required  but  are  not  shown 
here.  The  use  of  optical  lens  elements  would  certainly 
have  to  be  employed  when  diffraction  effects  could  not 
be  ignored.  The  matrix  A  and  B  information  are 
clocked  into  their  respective  light  valves  shown  in  Fig. 
4.  The  transferring  of  the  matrix  data  within  the  light 
valves  using  this  architecture  is  analogous  in  all  respects 
to  the  physical  translating  of  the  optical  transparencies 
as  previously  described.  Again,  the  desired  matrix  C 
information  is  generated  within  the  photodetector 
array,  where  it  may  be  clocked  out  at  a  later  time. 

The  reason  for  using  a  polarizing  beam  splitter  in  this 
architecture  is  to  eliminate  light  from  propagating  di¬ 
rectly  from  the  light  source  to  the  photodetector  array 
without  first  reflecting  from  each  of  the  two  light  valves. 
Of  course,  for  this  to  be  true,  the  incoherent  light  source 
must  be  polarized  as  noted  earlier.  If  the  light  valves 
were  to  behave,  for  example,  as  reflecting  mirrors,  one 
type  of  polarizing  beam-splitter  arrangement  which 
could  be  employed  is  shown  in  Fig.  5.  The  polarizing 
beam  splitter  would  be  of  the  Gian  prism  variety.19  In 
addition,  an  input  linear  polarizer  and  two  quarterwave 
plates  would  also  be  required.  It  is  noted  that  the  exact 
electrooptical  configuration  used  for  performing  the 
matrix- matrix  multiply  operation  will  be  highly  de¬ 
pendent  on  the  nature  of  the  particular  light  valves 
employed.  Light  -emitting  diodes  or  laser  diodes  appear 
most  attractive  as  the  incoherent  light  source.  The 
photodetector  array  could  be  an  array  of  photodiodes 
or  possibly  a  photoactivated  charge-coupled  device. 

For  the  architecture  described  herein,  it  has  been 
assumed  for  the  sake  of  simplicity  that  the  elements  of 
the  matrices  A,  B.  and  C  were  real  and  positive  only. 
The  issue  of  performing  matrix  operptions  involving 
matrices  and  vectors  whose  elements  are  bipolar  or  even 
complex  using  ireoherent  light  has  previously  been 
addressed.11  1,1  These  techniques,  therefore,  could 


Fig.  3.  Key  components  of  a  solid-state  optical  engagement  array 
matrix-matrix  multiplier. 


Fig.  4.  Data  handling  in  the  optical  engagement  array  processor. 
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Fig  5.  Polarizing  beam  splitter  with  support  optics 
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Fie.  >i  Architectures  for  performing  (a>  basic  matrix-matrix  mul¬ 
tiplication  AB.  i  b i  the  matrix  operation  ABC,  (c)  iterative  processing 
using  feedback. 


easilv  be  extended  to  include  this  architecture  as  well. 
Since  the  mathematical  operation  of  matrix-matrix 
multiplication  is  so  fundamental  to  a  number  of 
higher-order  matrix  operations,  this  basic  architecture 
could  serve  as  a  modular  building  block  for  these 
higher-order  operations.  The  basic  matrix-matrix 
multiply  operation  using  the  processing  cube  structure 
presented  in  this  paper  is  symbolically  represented  by 
the  diagram  in  Fig.  6(a).  Again,  A  and  B  are  the  input 
matrices,  and  AB  is  the  desired  output  matrix.  If  it  was 
important  to  perform  the  multiplication  of  three  ma¬ 
trices.  that  is, 

•  dll  Ai2  413  b  1 1  b)2  b]3  C 1 1  Cl  2  C 1 3 

ABC  =  a  2i  a22  a23  b21  b22  b23  c2i  c22  c23  ■  (4) 

an  ail  a33  b3i  b-s-2  b33  c.u  c32  c33 

two  processing  cubes  could  be  connected  in  a  serial 
manner  as  depicted  in  Fig.  6(  b).  It  should  be  noted  here 
that  the  AB  data  must  be  both  spatiallv  and  temporally 
reformatted  between  the  two  cubes  employed  for  this 
algorithm  to  work.  The  product  of  three  matrices 
would  be  useful  for  image-processing  type  applications. 
One  example  would  be  computing  the  2-D  discrete 
Fourier  tran.-form  of  an  array  of  pictorial  information. 
Matrix  B  would  contain  sampled  values  of  the  image, 
matrices  A  and  C  would  contain  the  discrete  Fourier 
transform  kernel  information,  and  matrix  ABC  would 
yield  'he  desired  discrete  Fourier  transform.  The  ar¬ 
chitecture  depicted  in  Fig.  Hie)  would  be  useful  in  those 
areas  for  example,  using  iterative  processing  requiring 
feedback.  The  expression  A  -  AB  as  --een  in  Fig.  He 
means  A  is  replaced  by  the  matrix  product  of  \  and  B. 


As  previously  mentioned,  the  solutions  of  simultaneous 
equations,  matrix  inversion,  and  eigensystem  deter 
mination  are  representative  of  higher-order  operations 
which  can  be  performed  using  iterative  processing. 

V.  Summary 

This  paper  has  presented  the  basic  concept  ol  a  rapid 
unbiased  bipolar  incoherent  calculator  cube  (RUBIC 
cube)  for  performing  matrix-matrix  multiplication 
using  an  optical  engagement-array  architecture.  Fu¬ 
ture  work  will  address  the  implementation  of  this  ar¬ 
chitecture. 
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ABSTRACT 


The  applications  of  the  photo-activated,  the  CCD-addressed,  and  the  variable¬ 
grating  mode  liquid  crystal  light  valves  (LCLVs)  to  optical  data  processing  are 
described.  These  applications  include  image  correlation,  level  slicing,  spectral 
analysis  and  correlation,  bi-speccral  image  division,  and  matrix-matrix 
multiplication. 


INTRODUCTION 


Coherent  optical  data  processing  (CODP)  (ref.  1)  offers  many  potential 
advantages  in  image  processing  as  well  as  in  the  processing  of  wide  bandwidth 
electrical  signals  which  are  amenable  to  two-dimensional  (2-D)  form.  One  of  the 
main  limitations  of  this  technology  has  been  the  lack  of  a  fast,  high-resolution, 
real-time  spatial  light  modulator  (SLM)  (refs.  2,  3).  These  devices  impose,  on  a 
coherent  optical  beam,  a  2-D  image  that  is  derived  from  either  an  incoherent  optical 
source  (photoactivated  SLM)  or  directly  from  a  properly  formatted  electrical  input 
signal  (electronically  addressed  SLM).  While  the  first  of  these  tasks  can  be 
accomplished  with  the  photoactivated  hybrid  field-effect  mode  (HYFEM)  liquid  crystal 
light  valve  (LCLV)  (ref.  4),  the  second  can  be  implemented  by  the  use  of  the  charge- 
coupled  device  (CCD)-addressed  LCLV  (ref.  5). 

The  first  generation,  CdS-based  photoactivated  device  is  already  in  production 
at  Hughes.  A  second-generation,  fast-response  silicon  photoconductor-based  device 
is  currently  under  development  at  Hughes  Research  Laboratories.  These  types  of 
devices  operate  in  conjunction  with  an  optical  input  source,  such  as  a  CRT  or  a 
laser  scanner  to  provide  a  real-time  coherent  output  image  (ref.  6).  The  novel, 
photoactivated  silicon  LCLV  (Figure  1)  with  its  high-broadband  input  sensitivity  may 
also  be  used  for  direct  imaging  of  the  scene  and  subsequent  image  processing  (e.g., 
for  robot ics ; . 
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In  COOP  applications  (such  as  radar  signal  processing  or  real-time  matched 
filters),  it  is  desirable  to  convert  the  electrical  input  directly  to  an  optical 
oucput  image  without  the  intermediate  step  of  first  converting  to  an  input  image  w\t 
a  CRT.  To  realize  this  function,  we  have  designed  and  developed  a  novel  type  of 
CODP  inputting  device  that  uses  a  CCD  array  to  serially  load  and  store  a  full  fraa* 
of  analog  electrical  information  which  is  subsequently  transferred  in  parallel  to  « 
Liquid  crystal  (LC)  layer  (Figure  2).  The  elimination  of  the  CRT  (or  equivalent 
process)  from  the  ODP  system  greatly  simplifies  the  system;  in  particular,  it 
eLxminaces  several  of- the  drawbacks  associated  with  it,  such  as  geometrical 
distortions,  stability,  and  jitter.  This  device  can  be  used  with  both  coherent  and 
incoherent  readout  sources,  extending  in  spectral  range  from  the  near  ultraviolet  to 
the  near  infrared. 

In  the  following  section,  some  applications  of  both  the  silicon  photoactivated 
LCLV  and  the  electronically  addressed  CCD— LCLV  to  ODP  will  be  described.  These 
applications  include  image  correlation  and  level  slicing,  spectral  analysis  and 
correlation,  bi-spectral  image  division,  and  matrix-matrix  multiplication. 


OPTICAL  PROCESSING  APPLICATIONS  OF  THE  SILICON  LIGHT  VALVES 
Image  Correlation  and  Level  Slicing 

Optical  data  processing  is  applicable  in  two  main  categories  of  data 
processing:  the  processing  of  wideband  serial  signals,  and  in  2-D  or  image 
processing.  The  photoactivated  device  is  most  effectively  used  in  image  processing 
applications,  while  the  CCD-addressed  spatial  light  modulator  can  be  used  in  both  of 
these  categories. 

One  example  of  image  processing  is  that  of  correlating  an  image  with  a 
reference  pattern,  as  shown  in  Figure  3.  Here  the  images  analyzed,  A(t)  (in  video 
form),  and  the  reference  image,  B(t),  are  correlated  using  a  joint-transform 
technique  (ref.  7).  The  two  CCD-SLMs  are  used  as  the  electro-optic  transducers  to 
generate  real-time  coherent  optical  images  in  which  amplitudes  are  superimposed  in 
the  Fourier  plane.  The  intensity  at  the  input  to  the  photoactivated  device 
contains,  among  other  terms,  the  multiplied  amplitudes  of  the  two  Fourier- 
transformed  images.  The  photoactivated  LCLV  is  then  used  to  retransform  the 
multiplication  image,  resulting  in  the  correlation  required. 

An  important  application  of  the  photoactivated  silicon  LCLV  is  direct-scene 
imagery  followed  by  coherent  processing.  This  function  is  required,  e.g. ,  in  robot 
vision  systems.  Here,  one  can  utilize  the  two  important  features  of  the  silicon 
device:  (1)  its  broadband  sensitivity  (400  to  1,100  run,  with  typically  50  uW/cm~  at 
540  nrn) ;  and  (2)  its  fast  time  response,  permitting  fast  scenery  changes  to  be 
processed.  In  the  configuration  shown  in  Figure  4,  the  input  scene  is  imaged  and 
converted  to  coherent  modulation  using  the  5i-LCLV,  and  is  subsequently  correlated 
with  a  matched  pattern  using  the  CCD-LCLV  as  a  programmable  matched  filter. 

The  use  of  tne  silicon  photoactivated  device  for  such  direct  image  processing 
further  permits  the  dual-frequency  mode  of  the  liquid  crystal  activation  to  be 
applied  (ref.  3).  This  may  result  in  cutting  the  response  cime  from  the  current 
16  ms  to  1  to  2  ms. 


Another  powerful  application  of  optical  processing  is  with  the  use  of  a  special 
jjiotoacc ivated  device:  the  variable  gracing  mode  (VGM)  SLM  (ref.  9).  The  device  is 
rased  on  the  formation  of  grating-type  regions  in  the  LC,  the  spatial  frequency  of 
w*uch  is  determined  by  the  voltage  drop  across  the  LC.  Since  a  very  high-impedance 
.-ictoconductor  is  required  for  this  light  valve,  the  silicon-MOS  configuration  is  a 
rotential  candidate. 

A  useful  application  of  the  device  is  intens icy-to-spat ial  frequency 
.-onversien,  shown  in  Figure  5.  Here,  the  device  is  used  to  level  slice  an  input 
isage  (shown  m  three  levels:  Ij,  I  2,  I 2) •  Filtering  at  the  frequency  plane  with 
“  *  F 2  (corresponding  to  I  *  I2)  results  in  the  generation  of  the  I  ■  I2  level  of 
:ne  input  image  at  the  output  plane. 

Large  Time-Bandwidth  Spectrum  Analyzer 

We  have  demonstrated  a  real-time  rf  spectrum  analyzer  with  an  extraordinarily 
v.gh  resolution  and  time-bandwidth  product  using  the  LCLV,  with  resolution  <10  2  Hz. 
T'.e  scheme  of  the  apparatus  is  shown  in  Figure  6.  The  rf  signals  were  amplified  and 
usplaved  in  raster  fashion  on  a  CRT.  The  signals  were  obviously  asynchronous  with 
:-e  rascer  scan  of  u>s  ■  20  x  10  3  sec-1  and  a  frame  time  of  7  x  10“ 2  sec.  The 
.-.coherent  optical  display  was  focused  on  the  photoconduct ive  input  of  the  LCLV 
vuch  acted  as  a  coherent-to-incoherent  transformer  as  the  output  of  the  LCLV  was 
illuminated  with  a  coherent  HeNe  laser.  This  transformation  permitted  an  optical 
Fourier  transformation  to  be  performed.  It  is  well  known  that  the  Fourier  transform 
:i  i  raster  pattern  in  time  is  a  raster  pattern  in  frequency,  as  shown  in  Figure  6. 
Dv-frequency,  Morse-coded  tone-modulated  rf  signals  from  oil  field  transmitters 
iupiayed  the  simple  textbook  A.M.  spectral  pattern  of  a  carrier  and  two  pulsating 
udebands.  More  complex  modulations  were  also  evident  in- the  display.  The 
taeoretical  resolution  is  given  by  the  ratio  of  u»9  to  the  number  of  lines,  which 
xi:h  N  *  1.4  x  10 3  lines  is  14  Hz.  Because  of  the  falloff  in  resolution  of  the 
-CIV  and  associated  optics,  the  resolution  achieved  was  somewhat  less  (80  Hz).  An 
::vious  improvement  of  this  system  will  be  the  replacement  of  the  CRT-imaging  lens 

a  CCD-addressed  LCLV. 

In  this  case,  the  ultimate,  1,000  array  CCD-LCLV  would  provide  106  point 
^solution  over  100  MHz  bandwidth  at  (real-time)  frame  rates  of  100  Hz.  Comparable 
:<rfonnance,  taking  into  account  size  and  power  requirements,  will  not  be  achievable 
Tf  *ven  the  most  advanced  digital  technology  currently  in  development  (i.e.,  VHSIC) . 

A  Real-Time  Spectrum  Analyzer /Correlator 

Another  important  application  is  real-time  spectrum  analysis  of  a  given 
»eene.  a  silicon  lignc  valve-based  system  that  can  perform  this  operation  is  shown 
;a  Figure  7.  xhe  operation  of  this  system  is  described  below. 

The  radiation  from  the  scene  to  be  analyzed,  l(W),  is  split  by  the  beam 
•slitter  in  a  Michelson  interferometer  configuration.  Two  mirrors,  a  standard  one 
a  staircase  one,  are  used.  The  interference  pattern  at  the  output  of  the 
■  "erterometer  (i.e.,  at  the  input  to  the  LCLV)  is  the  (spatial)  Fourier  transform 
■  Fne  input  spectrum.  This  is  analogous  to  a  conventional  Fourier  transform 
•?ec:rometer  (FTS’’  (ref.  10),  in  that  each  of  Che  staircase  steps  represents  one 
:;rror  location  in  a  moving  mirror  spectrometer .  The  subsequent  spatial  Fourier 
•'Astons  of  the  output  of  che  light  valve  results  in  the  spectral  analysis  of  Che 
?,'r  ar  •  imaging  array  l’i  g'iro  7  above  the  operation  of  fh®  spectral 

■■'freiacor .  The  readout  laser  beam  la  spatially  taodu  J  a  r  eO  >,y  >i,e  n,.ji  ici 


transformed  reference  spectrum  using  Che  CCD  lighc  valve.  This  modulated  beam  ij 
t hen  used  as  a  readout  lighc  for  Che  photoac t ivaced  light  valve.  At  the  input  of 
the  photoac t ivaced  light  valve,  the  spatial  mt er f erogr am  of  the  input  beam  vs 
present.  The  emerging  output  beam  consists  of  a  multiplication  of  the  input  and 
reference,  Fourier-transformed  spectra  presented  by  the  CCD-LCLV .  The  subsequent 
inverse  Fourier  transformation  carried  out  by  the  lens  results  in  the  appearance  of 
correlation  and  convolution  terms  of  the  two  spectra  at  the  imaging  array.  This 
system,  which  is  based  on  the  FTS  principle,  benefits  from  two  important  advantages 
of  the  FTS  system,  namely,  the  multiplexing,  or  the  Felgect's  advantage  in  signal- 
to-noise  racio,  and  the  throughput,  or  the  Jaquinot's  advantage. 

An  attractive  feature  of  this  system  is  chat  it  can  be  used  for  pattern 
recognition  purposes  with  a  flip  of  a  mirror.  In  this  way,  the  pattern  of  the 
incoming  beam,  rather  than  its  spectral  content,  can  now  be  analyzed  and  correlated 
with  a  suitable  reference  image  presented  by  the  CCD  light  valve,  as  in  Figure  4. 

The  system  can  thus  perform  both  spectral  and  pattern  correlations  of  the  scene. 

The  spectral  range  of  this  syscem  is  limited  by  the  photoactivated  light  valve 
since  it  must  be  sensitive  in  the  spectral  range  used.  The  existing  silicon  light 
valve  enables  us  to  use  the  400-nm  to  1, 200-no  range.  Since  the  detection  of  longer 
wavelengths  may  require  cooling  of  Che  light  valve,  the  LC  will  be  Che  limiting 
component  for  such  a  longer  wavelength  light  modulator.  It  is  estimated  that 
operation  up  to  3  um  can  be  achieved  using  LC  operating  at  low  temperatures. 

Possible  photoconductor  candidates  for  such  IK  lighc  valves  are  Ge,  InAs,  InSb,  or 
extrinsic  silicon,  depending  on  Che  cutoff  wavelength  required. 

The  spectral  resolution  largely  depends  on  the  manufacturing  of  the  staircase 

mirror.  One  could  conceive  more  chan  10,000  elements  of  resolution.  It  should  be 

pointed  out  that  for  the  photoactivated  and  CCD-addressed  light  valves,  a  resolution 
on  the  order  of  106  elements  is  possible. 

One  obvious  limitation  for  Che  application  above  is  the  intensity  of  the  input 
beam,  or  the  radiacion  level  from  the  scene  analyzed.  Using  the  silicon  light 

valve,  a  rough  estimate  for  the  input  illumination  level  required  is  100  yW/cm2  in 

the  visible  spectral  region.  Projected  performance  of  such  a  correlator  for  two 
spectral  regions  is  presented  in  Table  1.  Finally,  it  should  be  pointed  out  thac 
other,  possibly  more  efficient  methods  of  self-interference  of  the  incoming  analyzed 
beam  have  been  previously  suggested  (ref.  11). 

A  particularly  important  type  of  signal  processing  in  which  the  CCD-LCLV  may 
be  used  is  radar  signal  processing.  This  field  encompasses  ambiguity-function 
generation  and  3ynchecic  aperture  radar  (SAR)  processing. 

An  ambiguity-function  generation  system  using  two  LCLVs  was  previously 
described  (ref.  12).  The  replacement  of  Che  photoactivated  LCLV  by  a  CCD-addres sec 
LCLV  will  significantly  improve  the  syscem,  elimating  Che  CRT  and  the  acousto-opt ic 
units  required. 


The  3i-Spectral  Imaging/ Image  Division  Syscem 

Anocher  potential  application  of  Che  Si-LCLV  for  combined  spectral  and  scene 
analysis  is  Che  Bi-Spectrai  Imaging/Image  Division  System.  The  purpose  of  this 
system  is  to  obtain  the  (logarithmic)  image  of  the  intensity  ratio  of  the  scene  at 
two  wavelengths  in  the  400-nm  to  1100-aa  spectral  range.  This  operation  results  :a 


the  enhancement  of  specific  textures  in  the  scene.  Thus,  it  has  applications  in 
texture  recognition  such  as  the  remote  Earth- features  identification  system 
currently  under  development  by  NASA  (ref.  13).  The  schematics  of  the  Si-LCLV-based 
system  are  shown  in  figure  8.  The  operation  is  as  follows.  The  scene  imaged  by  the 
input  optics  is  split  into  two  channels  which  are  each  wavelength  filtered  in  the 
two  spectral  regions  (*lt  *2)  required  (400  nm  <  A1(  X2  <  H00  nm) .  Then  the 
filtered  images  are  spatially  modulated  by  logarithmic  halftone  screens  with 
different  spatial  frequency  for  each  channel,  Aj-Fj  and  *2-F2*  A  variable 
attenuation  compensator  placed  at  one  of  the  channels  acts  to  compensate  for 
intensity  imbalance  between  the  two  channels.  The  two  images,  each  modulated  by  a 
different  spatial  carrier,  are  then  recombined  at  the  input  to  the  silicon  liquid 
crystal  light  valve.  Thus,  each  of  the  two  images  at  the  two  different  wavelengths 
is  "tagged"  with  a  different  spatial  frequency  modulation.  The  photoactivated 
silicon  liquid  crystal  -light  valve  acts  as  a  sensitive,  broadband,  incoherent-to- 
coherent  image  converter.  A  spatial  Fourier  transform  is  then  performed  on  the  data 
readout  by  the  laser  beam.  The  diffractions  of  the  two  wavelength  images  will  now 
appear  separately  in  the  Fourier  plane,  due  to  the  different  spatial  carriers  for 
each  of  chose  images.  Spatial  filters  corresponding  to  each  of  the  two  halftone 
screens  are  placed  at  the  appropriate  locations  in  the  Fourier  plane.  This  results 
in  the  formation  of  logarithmic  intensity  images  following  a  retrans forming  lens 
(ref.  14).  A  180“  phase  retardation  plate  placed  at  one  of  the  filter  locations 
will  result  in  one  of  the  logarithmic  images  (X2)  having  a  reversed  phase  with 
respect  r.o  the  other.  Thus,  the  amplitude  of  this  image  formed  at  the  video 
detector  plane  will  be  proportional  to 

Aout  "  Ai<*»y>  +  a2(-x-:')  a  lo8  Xx Cx.y)  -  log  l2(x,y)  *  log  [l2/I2] 

where  I,(x,y)  and  I2(x,y)  are  the  intensities  of  the  input  images  at  Aj  and  A2> 
respectively.  The  image  amplitude  following  reconstruction  at  the  vidicon  input 
will  be  proportional  to  log  [ I( *2) /l( A ^ ] ,  i.e.,  to  the  (logarithmic)  ratio  of  the 
mages  at  A;  and  A2.  006  Co  the  high  sensitivity  of  the  silicon  photoconductor  in 

tne  silicon  light  valve  configuration  (about  40  uW/cm2),  the  imaging  system  is 
expected  to  have  sufficient  sensitivity  for  direct  imaging  of  Sun- illuminated 
scenes . 

It  should  be  noted  that  che  same  physical  region  of  the  light  valve  is  utilized 
•n  noth  channels.  This  is  done  in  order  to  minimize  non-uniformities  in  the  ratio 
image  obtained  by  the  wavefront  substraction.  Thus,  non-uniformities  associated 
vuh  amolitude  or  phase  defects  originating  in  the  light  valve  will  be  automatically 
Jubstrated.  The  "penalty",  however,  is  the  need  to  use  two  different  spatial 
frequencies,  reducing  the  bandwidth  available  for  image  information. 

The  Spectral  Range  of  the  bl-spectral  imaging/image  division  system  is  limited 
the  silicon  LCL7  (400  nm  to  1100  nm) .  As  indicated  above,  it  may  be  possible  to 
extend  the  spectral  range  of  the  silicon  device  into  the  3-  to  5-um  region. 

The  Dynamic  Range  of  this  system  is  limited  by  the  Si-LCLV,  which  is  typically 
I.  An  important  advantage  of  this  optical  processing  system  is  that  the  output 
r»:io  is  presented  by  3  coherent  light.  This  enables  a  straightforward  use  of 
:-tical  post-processing  (e.g.,  ratio  image  correlation). 

The  Spatial  Resolution  of  this  system  depends  on  the  spatial  frequencies 
raploved,  as  well  as  on  the  Si-LCLV  p.-r  f  ortnanc  e .  Taking  Fo  ”  25  cycles/mm  at  30% 


modulation  as  che  current  performance  of  the  Si-LCLV,  and  using  the  two  carrier 
frequencies,  as:  Fo/4  and  3Fo/4,  it  is  found  that  over  500  pixels  of  resolution  are 
available  using  the  43-mn  aperture  device,  with  AF  »  Fo/2. 


Application  of  the  CCD-LCLV  to  Systolic  Array  Processing 

Optical  numerical  processing  offers  a  unique  application  of  CCD— addressed 
LCLVs .  For  high  speed,  an  opcical  numerical  processor  must  utilize  spatial 
parallelism.  A  two-dimensional  data  array  offers  great  parallelism  but  can  entail  i 
significant  addressing  problem.  If,  however,  data  could  be  entered  a  line  at  a  tin* 
and  be  made  to  march  across  the  LCLV  at  the  chosen  clock  rate,  a  single  N  x  1  CCD 
line  could  address  a  full  N  x  N  data  array.  The  use  of  moving  electronic  data  in  « 
plane  for  such  numerical  operations  was  popularized  as  "systolic  array  processing" 
by  Rung  (ref.  15).  The  first  extension  of  systolic  array  processing  to  the  optical 
domain  used  one-dimensional  transducers  (acousto-optic  delay  lines  and  CCD 
detectors)  in  direct  analogy  with  VLSI  transducers  (ref.  16).  Recently,  Bocker 
et  al.  (ref.  17)  proposed  the  use  of  optics  for  systolic  array  processing  in  three 
dimensions,  which  electronics  cannot  do.  Their  Rapid  Unbiased  Incoherent  Calculate 
cube  (or  RUBIC  cube)  uses  two  electronically  addressed  spatial  light  modulators  to 
move  components  of  matrices  A  and  B  across  the  spatial  light  modulator  at  certain 
clock  rates.  One  possible  configuration  is  shown  in  Figure  9.  Because  two  pixels 
are  needed  for  real-number  representation,  we  can  multiply  the  two  (N/2)  x  (N/2) 
matrices  together  with  the  RUBIC  cube  in  (N-l)  clock  periods.  The  cube's  ability  c 
multiply  very  large  matrices  very  rapidly  with  low  power  consumption  should  make  Oil 
RUBIC  cube  very  important.  To  use  the  CCD-addressed  LCLV  for  the  RUBIC  cube,  one 
must  use  an  external  buffer  memory  which  will  feed  the  CCD-LCLV  with  one  line/colufl 
displacement  in  each  frame.  Alternatively,  it  may  be  possible  to  modify  the 
structure  of  the  CCD-LCLV  to  incorporate  an  internal  buffer  memory.  This  will 
enable  the  line/colunm  clocking  operation  required.  This  possibility,  although  not 
a  simple  t  sk,  may  also  be  desirable  for  other  applications  of  the  CCD-SLM. 
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TABLE  1.-  PROTECTED  SPECIFICATIONS  OF  THE  Si-LCLV- BASED 
FOURIER  TRANSFORM  SPECTROPHOTOMETER/CORRELATOR 

1.  VISIBLE  RANGE:  400  ran  <  X  <  1200  am 
BANDWIDTH:  if  -  16,700  cn-1 

NO.  OF  RESOLUTION  ELEMENTS:  N  *  100  X  100 
SPECTRAL  RESOLUTION:  <S£  *  1.67  cm"1 
MAXIMUM  "STROKE":  <5DMAX  a  I/<5f  *  0.6  c® 
"ROUGH"  STEPS:  <5DX  »  0.6  cm/ 100  -  60  um 
"FINE"  STEPS:  6Dy  -  60  um/100  -  0.6  um 

2.  1.5-um  TO  4.5-um  REGION 
BANDWIDTH:  if  -  4440  cm"1 

NO.  OF  RESOLUTION  ELEMENTS:  N  -  100  X  100 
SPECTRAL  RESOLUTION:  <5f  -  0.44  cm"1 
MAXIMUM  "STROKE":  dD,^  -  1/0.44  -  2.27  cm 
"ROUGH"  STEPS:  <5DX  -  227  um 
"FINE"  STEPS:  <5Dy  -  2.27  um 


STEPS  DIMENSION  (BOTH  CASES)  =0.5  am  X  0.5  am  FOR  50- am  APERTURE 


▼-Silicon 


Figure  3.-  A  joint  transform- baaed  image  correlation  system 
using  CCD-addressed  and  photoactivated  devices. 


Figure  4.-  An  imaging/scene  correlation  system  using  the 
silicon  liquid  crystal  light  valves. 


VGM  FOURIER 


Figure  5.-  Intensity  level  slicing  of  an  image  using  the  VGM  modulator. 
The  I  *  I2  level  is  reproduced  at  the  output. 
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Figure  6.-  Real-time,  large  time-bandwidth  spectrum  analysis. 
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Figure  8.-  A  bi-spectral  imaging/ image  division  system  based 

on  the  silicon  LCLV . 


APPENDIX  N 

SPECIAL  CONSIDERATIONS  FOR  OPERATING 
OPTICAL  ITERATIVE  CALCULATIONS 
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processors 
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Tht  man  lung  ..i  me  lenllMek  ,  irauirv  to  the  opucai  systolic  or  engagement  processor  permits  simple  pipelining 
of  suti.narv  iterative  .ugoritiims  as  well  as  on  the  fly  scale  adjustment  similar  in  effect  to  floating-point  calcula¬ 
tion 


Once  a  suitable  stationary  .terative  algorithm  is  cho¬ 
sen.  an  optical  systolic  or  engagement  matrix  algebra 
processor  can  be  used  to  perform  a  useful  operation, 
such  as  solution  of  ,\  linear  equations  with  .V  unknowns, 
singular  value  decomposition,  and  eigen  problems. 
Most  past  work  has  been  either  on  algorithms1  5  or  on 
processors."  1  Here  we  seek  to  complete  the  analysis 
by  showing  how  the  processor  arid  feedback  circuitry 
can  he  combined  to  achieve  pipelined  iterative  systolic 
processing  u  e  .  a  feedback  data  fl* >w  •*>  matched  to  the 
processor  input,  output  data  flow  r.ai  no  slowdown  of 
the  processor  is  required  i  That  is  'he  feedback  elec¬ 
tronics  that  implements  the  iterations  must  ne  matched 
to  the  matrix  processor  We  sc.  w  mat  such  electronics 
can  also  adjust  the  scale  of  the  pm.'iienri  during  each 
cvcle  in  such  a  wav  as  to.isso.  •  ^urnum  use  of  the  dy¬ 
namic  range  of  the  system  Thus  optical  processors  can 
operate  in  a  numerical  mode  'hat  i-  neither  integer 
.  fixed  point  i  nor  floating  poiiv  nut  much  closer  to  the 
latter  and  tnni'h  ni.y.'  'wl:;,  tn.it  ' "e  !. .rtaer 

We  consider  .»tm  matrix  veep  ,-r  >cessi,rs  ,n  Uetaii. 
bu'  extension  to  matrix  matrix  :  censors  is  .straight - 
'  rwarri.  In  a  systolic  or  engagement  processor  fur  the 
••qoat  a  'll 

-1  x  =  b  ill 

ne  x  or-.ipor.er ire  .tipu;  sequent .a-  v  anil  'alter  a 
.  ert  am  loading  t une  '  1  '  tie  pipeline  ’  t he  h  components 
ir>-  i  nit  put  sequentially  For  i  tad  minanded  optical 
•  nguct'inent  matrix  ve,  i,,r  processor.  the  first  b  com 
ii.nent  >  .  .m, acted  during  t ne  same  cluck  per.od  in 
wnich  the  last  x  aimpnnent  is  entered  It  there  is  to  he 
is  ,nwd own.  Ine  first  .  oni portent  >t  the  new  iteration 
•  x  must  in-  calculated  during  the  same  clock  time 
1  .u  ■  Me  deration  must  tie  -  ■ !  a  t.-rtn  si.  b  as 


■r 


w *ier-  *  itie,  .  .  i-,  i , x .  1 1  -!  ci  c.ir\  tond  n  uis 
I  -u;x*r  r  i  ’  ~  r>  -pr--  •  -  at .-  i.  ,  j  .  .n, :  .or  .it’- 1  I  he 

'  -  rip:  -  -epr>-  .  ,  .fi; s  We  ■.  -e  i  air  'X 
impie  ire-  1  i.  obi  :ii"  Civ  method  ior  tuiding 

■  i  ic.  ■  .  i:  -  .s  . s  :  i n . 


x  =  A  'b.  ( 4 1 

We  write 

.4  =  L  +  D  +  L  ,  15) 

where  L  and  f  ’  are  lower  and  upper  triangular  matrices, 
respectively,  and  D  is  a  diagonal  matrix.  Inserting  Eq. 
(5)  into  Eq.  (1)  and  rearranging  leads  to 

x  =  -D-hL  +  U)x  +  D-*b.  (6) 

From  this  we  obtain 

x.k-mi  =  y,Kl  +  c  ,7) 

where 

yiKi  =  BxiKi, 

B  =  -D-'\L  +  C). 

c  =  D  ~ 1  b 

t  dearly.  Eq.  (7)  has  the  form  of  Eq.  12).  For  many 
problems, 

hm  x'K'  =  x,  K  -*•  »  (8) 

A  necessary  and  sufficient  condition  is  that  the  spectral 
radius  of  B  be  less  than  one.10  Again,  this  is  just  an 
example  for  definiteness.  Many  other  stationary  iter¬ 
ative  algorithms  for  this  problem  and  for  other  prob¬ 
lems.  e.g.,  the  eigen  problem,  exist. 

An  engagement  processor  produces  the  components 
>t  x-K  +  "  in  sequence.  To  obtain  the  ith  component, 
we  require  the  following:  ilia  sequencer  that  puts  the 
proper  >,  Kl  signal  ithe  one  just  completed)  into  the 
adder.  1 2l  a  sequencer  that  puts  the  proper  c,  into  the 
adder,  di)  an  adder  of  \  .'h‘  and  c. ,  and  i4)  whatever 
amplifier  may  be  needed  to  insert  r, ,K  *  “  into  the  ma¬ 
trix  multiplier  Figure  1  shows  the  system  schemati¬ 
cally  Note  (hat  only  one  adder  and  only  one  amplifier 
are  needed 

One  problem  remains,  scaling  Let  us  define  .V/lK 
as  the  magnitude  of  (he  largest  component  of  xlKl.  Let 
us  aisu  define  M  as  (he  maximum  value  that  a  compo¬ 
nent  "t  x  can  a.  some  and  Uili  be  represented.  Usually 
the  components  of  x  are  represented  by  transmissions. 

Ml 
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Fig  l  Implementation  ol  x'*  '  -  fix'*  ■'  +  c  with  an  optical  engagement  processor  The  ftv's 
are  the  components  of  the  R  matrix. 


XI  =  1.  (9) 

It  XV  K  '  «  V/,  the  system  accuracy  is  poor.  If  .V/'Kl  t> 
XI,  the  input  saturates  and  accuracy  is  poor.  Clearly, 
good  accuracy  requires  that  ;V/'Kl  =  XI  Unfortunately 
we  do  not  know  ,V/IK'  until  after  it  is  calculated.  The 
best  that  we  can  do  is  to  use  to  approximate 

M‘h  ‘  and  scale  the  input  to  aim  at  M'H^U  =  aM,  where 
ot  <  1.  We  might  choose  a  =  0.9.  We  then  set  the 
amplifier  to  feed  back  not  jc1,k+!>  but  sx,lK+n,  where 
s  is  a  scale  chosen  to  give  M'K  * 1 1  »  aM.  Here  we  are 
using  the  fact  that,  if  Ax  =  b, 

A  ( s  x )  =  sb.  (10) 

To  find  x  from  sx  we  will  need  to  remember  the  last  s 
used.  It  may  happen  that,  despite  the  fact  that.s  <  1. 
the  next  Xf‘K'  is  equal  to  one  i  indicating  probable  sat¬ 
uration).  In  that  case  we  apply  in  the  next  iteration  a 
smaller  scale.  These  improvements  require  more 
electronics  than  the  simple  system  of  Fig.  1.  including 
some  memory  (as  it  uses  iterations  of  the  form  of  Eq. 
(3)|.  In  this  regard  the  logic  is  similar  to  that  of  using 
Kalman  filtering  in  control  systems. 

What  we  gain  is  adaptive  scaling.  If  we  can  divide  the 
output  into  S’  leveis  of  spacing  .V//.Y.  the  full  .V:  1  dy¬ 
namic  range  s  available  a:  iteration  K  only  if  X1'K'  « 
M.  So  far  as  M  h  goes,  the  adaptive  staling  makes  this 
a  floating-point  calculation.  Ail  other  components  of 
x  are  not  calculated  with  equal  accuracy  since  only  one 
scale  is  used  for  x  and  it  is  chosen  to  be  optimum  for  the 
large^t-magnitude  component  of  x. 

The  components  of  x  now  appear  to  be  calculated  in 
floating  point.  Unfortunately,  this  is  not  the  case. 
Only  the  maximum  component  is  calculated  by  a  true 
floating-point  method.  While  ensuring  that  the  largest 
component  remain  close  to  but  less  than  M.  this  ap¬ 
proach  can  reduce  other  components  to  be  much  less 
than  XI  We  seek  a  method  that  wll  scale  all  c»mp«> 


nents  individually.  A  generalization  of  the  above 
scaling  method  will  permit  a  floating-point  calculation 
on  all  components  of  x.  We  rewrite  Ax  =  b  as 

A  S  ~ 1  .S  x  =  b 

and  multiply  on  the  left  by 

<AVtS-i)Sx  =  Sb,  (Hi 

where  .S  is  a  diagonal  matrix  that  is  used  to  scale  x.  The 
diagonal  element  S„  scales  the  ith  component  of  x. 
This  matrix  also  scales  b.  If  all  the  diagonal  elements 
of  S  are  equal,  that  is,  if  .S'  is  the  constant  matrix  si,  then 
this  approach  simplifies  to  the  previous  one.  The  ma¬ 
trix  d  =  SAS~]  is  similar  to  A.  It  has  the  same  eigen¬ 
values,  determinant,  and  condition  number.  The 
condition  number  gives  an  indication  of  the  size  of 
uncertainties  generated  in  the  solution  vector  x  from 
uncertainties  in  the  elements  of  A.  The  size  of  the 
uncertainties  in  x  is  bounded  by  the  condition  number 
times  the  size  of  the  uncertainties  in  .4.  The  transfor¬ 
mation  to  G  causes  no  change  in  the  conditioning.  If 
■S’  i3  the  constant  matrix  si,  it  commutes  with  4.  and  Eq. 

•  ID  reduces  to  Eq.  1 10).  The  matrix  S  can  be  changed 
at  each  iteration  to  scale  x. 

We  define 

v*»S*x*  (I2al 

as  the  scaled  solution  vector  after  the  Kth  iteration 
and 

wkm  _  £KxK+\  i  12b) 

as  unsealed  output  vector  after  the  k  +  1-th  iteration. 
Further  defining 

ft*  =  SKB\SK)-'  1 12c) 

;  nd 


N  -5 


eh  =  [)-'-SKh 


H  2d) 
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leads  to  the  generalization  of  Eqs.  1 6 >  and  (7).  For  the 
Kth  iteration  the  two-step  process  is  to  find  wK_rl  as 

wK+i  =  +  eK  (13) 

and  then  scale  w**1  to  find  5K+l  and  vK  +'1. 

The  matrix  BK  is  similai  to  B.  It  has  the  same  ei¬ 
genvalues  as  B.  In  particular,  the  spectral  radius  of  BK 
is  equal  to  that  of  6.  Therefore  the  convergence 
properties  of  the  method  using  Eq.  ( 13)  are  the  same  as 
those  obtained  from  Eq.  (6).  In  principle,  the  method 
allows  B  to  be  updated  at  each  iteration.  If  ,V  is  the 
dimension  of  the  problem,  this  update  requires  2.V2 
operations.  However,  far  fewer  updates  may  be  re¬ 
quired.  If  w  is  not  saturated  or  if  none  of  its  elements 
is  small  compared  to  M .  then  5K+  *  can  be  taken  to  SK, 
and  no  update  of  B  is  needed.  The  dynamic  range  of 
B  changes  from  update  to  update  and  is  different  from 
B  The  elements  of  B  are  related  to  B  by  BtJ  = 
~  1  The  columns  of  B  are  scaled  by  S-1;  the 
rows  are  scaled  by  5.  The  method  permits  apportion¬ 
ment  of  the  dynamic  range  between  the  solution  vector 
v  and  the  matrix  B. 

In  conclusion,  we  have  considered  a  generalized 
scheme  that  allows  the  scaling  of  each  component  of  x 
in  the  solution  of  the  .4x  =  b  problem  and  a  simplified 
version  in  which  a  single  scale  is  used  for  all  components. 
The  proposed  method  does  not  alter  the  conditioning 
of  the  problem  I  the  G  matrix  is  similar  to  .4 ),  nor  does 
it  alter  the  convergence  rate  (the  matrix  B  is  similar  to 
B).  In  the  general  form,  however,  a  new  B  must  be 


formed  from  time  to  time,  a  computational  burden  of 
2N'-  multiplications.  If  the  simpler  form  is  used,  in 
which  all  scale  factors  are  the  same,  then  G  =  A,  B  =  B, 
and  no  additional  computational  burden  is  required. 

It  is  clear  that,  to  the  extent  that  the  dynamic  range 
of  both  the  given  problem  ard  the  given  hardware  per¬ 
mits  it,  floating-point  optical  systolic  and  engagement 
processors  are  feasible. 

This  research  was  supported  by  U.S.  Air  Force  con¬ 
tract  F19628-82-C-0068. 
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